Mitc.Support.FuzzySearch.EntityFramework 1.0.0
SQL Server fuzzy / approximate string matching for EF Core. Two layers in one package, no UDF deployment required.
- Layer 1 (translatable primitives): three SQL Server built-ins (SOUNDEX, DIFFERENCE, PATINDEX) exposed as LINQ-translatable string extensions. Compose into Where/OrderBy clauses; the SQL stays SQL.
- Layer 2 (in-memory edit-distance terminator): an IQueryable<T> extension that injects a lossless SQL length-window pre-filter, materializes a small candidate set, computes Levenshtein distance in C#, and returns matches ranked by distance.
Install
dotnet add package Mitc.Support.FuzzySearch.EntityFramework
Setup
Register the Layer 1 functions once in OnModelCreating:
using Microsoft.EntityFrameworkCore;
using Mitc.Support.FuzzySearch;

// Inside your DbContext subclass:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.RegisterFuzzySearchFunctions();
}
Layer 1 — translatable primitives
using Mitc.Support.FuzzySearch;
// Phonetic equality — translates to: SOUNDEX(Name) = SOUNDEX(@term)
context.Users.Where(u => u.Name.SoundsLike("Smith"));
// 0-4 phonetic similarity (DIFFERENCE); higher means more similar, despite the "Distance" in the method name.
// Translates to: DIFFERENCE(Name, @term)
context.Users.Where(u => u.Name.PhoneticDistanceFrom("Smith") >= 3);
context.Users.OrderByDescending(u => u.Name.PhoneticDistanceFrom("Smith"));
// Pattern position — translates to: PATINDEX(@pattern, Name)
// Returns 1-based position of first match, 0 if no match.
// Return type is `long` (not `int`) so the result handles PATINDEX's `bigint`
// return on (n)varchar(max) columns without runtime cast errors.
context.Users.Where(u => u.Name.IsLike("%Mit%") > 0);
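// Earlier match sorts first. Non-matching rows return 0 and therefore sort ahead of every match,
// so combine this with the Where filter above when only matches are wanted.
context.Users.OrderBy(u => u.Name.IsLike("%Mit%"));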
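To give a feel for which names SoundsLike groups together, here is a minimal sketch of classic American Soundex in C#. It is an illustration only: the class name SoundexSketch is hypothetical and not part of the package, and it is an assumption that the output matches SQL Server's SOUNDEX for plain ASCII names, since SQL Server's behavior also depends on compatibility level and collation.

```csharp
using System;
using System.Text;

public static class SoundexSketch
{
    // Digit for each letter a..z; '0' marks vowels plus h, w and y, which carry no code.
    private const string Codes = "01230120022455012623010202";

    public static string Encode(string input)
    {
        if (string.IsNullOrEmpty(input)) return "0000";

        var result = new StringBuilder(4);
        char previous = '0';

        foreach (char raw in input)
        {
            char c = char.ToLowerInvariant(raw);
            if (c < 'a' || c > 'z') continue;          // skip anything that is not a-z

            char code = Codes[c - 'a'];

            if (result.Length == 0)
            {
                // The first letter is kept verbatim, but its code still suppresses
                // an immediately following letter with the same code.
                result.Append(char.ToUpperInvariant(c));
                previous = code;
                continue;
            }

            if (c == 'h' || c == 'w') continue;        // h/w do not break a run of equal codes

            if (code != '0' && code != previous) result.Append(code);
            previous = code;                           // a vowel resets the run

            if (result.Length == 4) break;
        }

        return result.ToString().PadRight(4, '0');
    }
}
```

Under this sketch, "Smith" and "Smyth" both encode to S530, which is why a phonetic equality check treats them as the same name, while "Robert" and "Rupert" both encode to R163.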
Layer 2 — in-memory Levenshtein terminator
// Returns matches with distance <= maxDistance, ordered by distance ascending.
List<User> matches = await context.Users
    .ToListByEditDistanceAsync(u => u.Name, "John", maxDistance: 2);
// Same selection, but each result is paired with its computed edit distance:
List<EditDistanceMatch<User>> ranked = await context.Users
    .ToMatchesByEditDistanceAsync(u => u.Name, "John", maxDistance: 2);
With maxDistance: 2, the terminator first runs WHERE LEN(Name) BETWEEN @term.Length - 2 AND @term.Length + 2, materializes the remaining candidate set, then computes Levenshtein in C#. The length pre-filter is mathematically lossless on non-NULL values: each insertion or deletion changes the length by exactly one and a substitution leaves it unchanged, so Levenshtein distance ≤ k requires |len_a − len_b| ≤ k. Rows where the selector value is NULL are excluded by the pre-filter and do not match.
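The flow described here (length window, then Levenshtein, then ranking) can be sketched over an in-memory sequence. This is a minimal illustration under stated assumptions, not the package's implementation: EditDistanceSketch, Levenshtein and Rank are hypothetical names, characters are compared ordinally, and the length check stands in for the SQL LEN(...) BETWEEN predicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EditDistanceSketch
{
    // Two-row dynamic-programming Levenshtein distance (insert, delete, substitute all cost 1).
    public static int Levenshtein(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.Min(substitution, Math.Min(deletion, insertion));
            }
            (previous, current) = (current, previous);   // reuse the two rows
        }
        return previous[b.Length];
    }

    // Length window first (cheap), then the exact distance, then ascending ranking.
    public static List<(string Value, int Distance)> Rank(
        IEnumerable<string> values, string term, int maxDistance)
    {
        return values
            .Where(v => v != null && Math.Abs(v.Length - term.Length) <= maxDistance)
            .Select(v => (Value: v, Distance: Levenshtein(v, term)))
            .Where(m => m.Distance <= maxDistance)
            .OrderBy(m => m.Distance)
            .ToList();
    }
}
```

For the term "John" with maxDistance 2, a value such as "Jonathan" (length 8) is rejected by the window without any distance computation, because a length difference of 4 already implies a distance of at least 4.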
When to use which
| Use case | Recommended primitives |
|---|---|
| User search by name | SoundsLike and/or PhoneticDistanceFrom (tight threshold), or ToListByEditDistanceAsync with low maxDistance for typo tolerance |
| Blog post / article search | IsLike for substring matching, optionally combined with ToListByEditDistanceAsync on the title for typo tolerance |
| Document / longer-text search | IsLike with looser patterns; full-text search is out of scope for this package |
Performance: composing Layer 1 in front of Layer 2
The Layer 2 terminator's only built-in pre-filter is the lossless length window. On very large tables (millions of rows), the length window alone may still leave a lot of candidates to materialize. Because Layer 1 primitives are translatable, you can chain them in front of ToListByEditDistanceAsync and they stay in SQL:
var matches = await context.InspectionObservations
    .Where(io => io.Observation.Name.PhoneticDistanceFrom(term) >= 2) // Layer 1, in SQL
    .Include(io => io.Observation)                                    // selector reads nav prop
    .ToListByEditDistanceAsync(io => io.Observation.Name, term, maxDistance: 2);
The Where(... PhoneticDistanceFrom(term) >= 2) pushes a DIFFERENCE(...) check to the database alongside the package's length pre-filter, which can cut the candidate set substantially before materialization. Tighter phonetic thresholds (>= 3 or >= 4) cut more aggressively but have a small false-negative rate on edit-close-but-phonetically-distant pairs; >= 2 is a safe default.
For the case where the searched string lives on a small lookup joined to a fact table, the structurally-correct pattern is to run the edit-distance match against the lookup first and then filter the fact table by FK — this avoids running Levenshtein once per fact-table row.
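The lookup-first pattern can be sketched in memory as follows. The record shapes, the LookupFirstSketch class and the distance delegate are all hypothetical stand-ins chosen for illustration; the delegate represents whatever edit-distance function is in use, so the sketch makes no claim about the package's internals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes: a small lookup table and a large fact table that references it.
public record Observation(int Id, string Name);
public record InspectionObservation(int Id, int ObservationId);

public static class LookupFirstSketch
{
    public static List<InspectionObservation> Match(
        IReadOnlyCollection<Observation> lookup,
        IEnumerable<InspectionObservation> facts,
        string term,
        int maxDistance,
        Func<string, string, int> distance)
    {
        // Step 1: one distance computation per lookup row.
        var matchingIds = lookup
            .Where(o => distance(o.Name, term) <= maxDistance)
            .Select(o => o.Id)
            .ToHashSet();

        // Step 2: plain key filter over the fact rows; no string comparison happens here.
        return facts.Where(f => matchingIds.Contains(f.ObservationId)).ToList();
    }
}
```

The number of distance computations scales with the lookup table rather than the fact table. With EF Core, step 1 would presumably be ToListByEditDistanceAsync on the lookup set and step 2 a Where on the foreign key using Contains over the matched ids, though that wiring is an assumption about your model rather than something the package prescribes.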
Versioning
Multi-targeted from net5.0 through net10.0. The EF Core dependency is pinned to the matching major per target.
.NET 5.0
- Microsoft.EntityFrameworkCore (>= 5.0.0 && < 6.0.0)
- Microsoft.EntityFrameworkCore.Relational (>= 5.0.0 && < 6.0.0)
.NET 6.0
- Microsoft.EntityFrameworkCore (>= 6.0.0 && < 7.0.0)
- Microsoft.EntityFrameworkCore.Relational (>= 6.0.0 && < 7.0.0)
.NET 7.0
- Microsoft.EntityFrameworkCore (>= 7.0.0 && < 8.0.0)
- Microsoft.EntityFrameworkCore.Relational (>= 7.0.0 && < 8.0.0)
.NET 8.0
- Microsoft.EntityFrameworkCore (>= 8.0.0 && < 9.0.0)
- Microsoft.EntityFrameworkCore.Relational (>= 8.0.0 && < 9.0.0)
.NET 9.0
- Microsoft.EntityFrameworkCore (>= 9.0.0 && < 10.0.0)
- Microsoft.EntityFrameworkCore.Relational (>= 9.0.0 && < 10.0.0)
.NET 10.0
- Microsoft.EntityFrameworkCore (>= 10.0.0 && < 11.0.0)
- Microsoft.EntityFrameworkCore.Relational (>= 10.0.0 && < 11.0.0)
| Version | Downloads | Last updated |
|---|---|---|
| 1.0.0 | 0 | 5/1/2026 |