📢 Proposal: Introducing SqlType SqlVectorFloat32 for SQL Server's New VECTOR Data Type
#3382
Status: Proposed
Related Documentation: VECTOR (Transact-SQL)
🎯 Summary
With SQL Server's new support for the VECTOR data type, we propose adding first-class support for interacting with it through a new public API: SqlVectorFloat32. This type will be included in the Microsoft.Data.SqlTypes namespace in SqlClient.
This proposal introduces:
- A new `SqlVectorFloat32` class that provides a managed representation of the `VECTOR` SQL type.
- Full support for reading and writing vector values using `SqlParameter`, `SqlDataReader` and `SqlBulkCopy`.
- Serialization and deserialization logic that complies with the TDS wire format for `VECTOR`.
💡 Motivation
The new VECTOR type in SQL Server enables machine learning, embedding, and AI scenarios by providing optimized storage for fixed-length float vectors. However, the vector format requires that even NULL values, as well as output parameters of type vector, carry metadata such as:
- Length
- ElementType
Due to these requirements, there is no direct CLR equivalent for the SQL Server VECTOR type (i.e., no float[] mapping). As such, we need to introduce a new structured type, SqlVectorFloat32, to handle this correctly. Also, we do not want to map float[] directly to the vector data type for Float32, since that would create ambiguity if and when SQL Server introduces a data type for the CLR type float[].
✅ Proposed API Surface
    public sealed class SqlVectorFloat32 : INullable
    {
        // Initializes a SqlVectorFloat32 instance that represents a vector column
        // of the given length for Float32 values.
        public SqlVectorFloat32(int length);

        // Initializes a SqlVectorFloat32 instance with Float32 data, intended for a
        // vector column whose length equals the length of values.
        public SqlVectorFloat32(ReadOnlyMemory<float> values);

        // Provides access to the Float32 data represented by the instance. Returns an
        // empty array if the instance was not initialized with any Float32 data.
        public float[] Values;

        // Returns the Float32 data as a JSON string.
        public override string ToString();

        // Indicates whether the current instance represents a Null value. Returns true
        // if the instance was not initialized with Float32 data.
        public bool IsNull { get; }

        // Required for maintaining compatibility with DataTable. Points to `null`.
        public static SqlVectorFloat32 Null { get; }

        // Indicates the length of the vector column for which the current instance is suitable.
        public int Length { get; }

        // Indicates the potential size in bytes of the instance for an n-length vector
        // column, where n is Length. Useful for Prepare(), where the SqlParameter size
        // needs to be initialized explicitly.
        public int Size { get; }
    }
🧪 Sample Usage of SqlVectorFloat32
Below are sample code snippets that illustrate how to use SqlVectorFloat32 with various ADO.NET operations.
📥 Inserting Values into a Table
Insert non-null vector (explicit SqlDbType):
    var command = new SqlCommand("INSERT INTO dbo.vectors (id, v) VALUES (@Id, @VectorData)", conn);
    command.Parameters.AddWithValue("@Id", 1);
    var vectorParam = new SqlParameter("@VectorData", SqlDbTypeExtensions.Vector);
    vectorParam.Value = new SqlVectorFloat32(new float[] { 1.1f, 2.2f, 3.3f });
    command.Parameters.Add(vectorParam);
    command.ExecuteNonQuery();
Or using the SqlParameter constructor that takes the value directly:
    var vectorParam = new SqlParameter("@VectorData", new SqlVectorFloat32(new float[] { 1.1f, 2.2f, 3.3f }));
Or using AddWithValue
    command.Parameters.AddWithValue("@VectorData", new SqlVectorFloat32(new float[] { 1.1f, 2.2f, 3.3f }));
Inserting Null values with preserved metadata:
    var command = new SqlCommand("INSERT INTO dbo.vectors (id, v) VALUES (@Id, @VectorData)", conn);
    command.Parameters.AddWithValue("@Id", 2);
    // Still provide the element count, even for null
    var vectorParam = new SqlParameter("@VectorData", SqlDbTypeExtensions.Vector);
    vectorParam.Value = new SqlVectorFloat32(3); // Null vector with dimension
    command.Parameters.Add(vectorParam);
    command.ExecuteNonQuery();
Reading Vector Values From a SQL Server Table With SqlDataReader.GetSqlVectorFloat32()
    using (SqlDataReader reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            SqlVectorFloat32 vector = reader.GetSqlVectorFloat32(0);
        }
    }
Or using DbDataReader.GetValue()
    SqlVectorFloat32 vector = (SqlVectorFloat32)reader.GetValue(0);
Using Vectors in a Stored Procedure
    CREATE PROCEDURE CopyVectorParamData
        @InputData vector(3),
        @OutputData vector(3) OUTPUT
    AS
    BEGIN
        SET @OutputData = @InputData;
    END

    var command = new SqlCommand("CopyVectorParamData", conn);
    command.CommandType = CommandType.StoredProcedure;
    var inputParam = new SqlParameter("@InputData", SqlDbTypeExtensions.Vector)
    {
        Value = new SqlVectorFloat32(new float[] { 4.4f, 5.5f, 6.6f })
    };
    var outputParam = new SqlParameter("@OutputData", SqlDbTypeExtensions.Vector)
    {
        // Declare an output parameter for a vector column that accepts only 3 elements
        Value = new SqlVectorFloat32(3),
        Direction = ParameterDirection.Output
    };
    command.Parameters.Add(inputParam);
    command.Parameters.Add(outputParam);
    command.ExecuteNonQuery();
    var result = (SqlVectorFloat32)outputParam.Value;
    // Values is the accessor listed in the proposed API surface above
    Console.WriteLine(string.Join(", ", result.Values));
-
@roji Your inputs would be helpful. We would like to understand if there are any concerns from EF side in consuming the proposed API.
CC: @dotnet/sqlclientdevteam
-
Thanks for the proposal and ping @apoorvdeshmukh - it's great to see this effort starting.
I don't think there's much to say from an EF perspective - it should be able to support whatever type you decide to go with; in other words, users would be able to have a SqlVectorFloat32 property on their .NET type, and EF would read and write that to the database as with any other type. I'm a bit concerned about the odd requirements around NULL though: EF typically just sends DBNull.Value for nulls - that works with all other types, but it seems like vector will be a new, special exception to that. Hopefully this won't be a blocker.
Below are various thoughts. Once we're more baked around the API shape, I'd recommend we bring this to the API design panel - these are the folks that discuss all changes to BCL APIs etc. Since SqlClient isn't part of the BCL this isn't mandatory, but they can provide very valuable advice around all the issues we're discussing.
- Agree with @edwardneal that sealing SqlVectorFloat32 is a good idea.
- In fact, you may want to consider making the type a struct rather than a class. It's very small - so a good candidate for a struct - and this would save the needless extra allocation. If you do this, I'd recommend making this a `readonly struct`, as mutable structs are hard to work with. We can also consult with .NET API designers on this question if we want.
- I highly recommend accepting `ReadOnlyMemory<float>` rather than just `float[]` on the type; `ROM<float>` allows representing a slice of an array, as well as unmanaged memory, and would allow more efficient usage. Note that an implicit cast exists from `T[]` to `ROM<T>`.
- Does `SqlVectorFloat32.ToArray()` copy the array out, or expose access to the original array? In general, is SqlVectorFloat32 intended to be a very thin wrapper above user data (I certainly hope so) rather than copying the user's data into an internal buffer?
    - It seems quite important to just expose the data, rather than copying. If this copies, then users have no way of reading a vector from SQL Server without an additional expensive copy.
    - ToArray() in .NET generally connotes a copy rather than simply exposing the data. To expose, the more natural thing would likely be a simple property (a method like ToArray suggests potentially heavy processing, while a property suggests simple immediate access).
- If ISqlVector is a public interface, then it would be good to understand what it does, the motivations etc. (OTOH I'm not sure something like that makes sense). Of course, if it's internal then no problem at all (usually internal interfaces are left out from (public) API proposals).
- Naming: since the SQL Server type is called `vector`, I'd recommend calling the .NET type SqlVector rather than SqlVectorFloat32.
    - Unless I'm mistaken, existing Sql* types mirror the naming of the SQL Server type they represent. This makes things easier to understand for users, who may wonder if SqlVectorFloat32 is the right .NET type for interacting with SQL Server `vector`.
    - If other vector types are introduced in the future, I expect they'll have to have different names (`half_vector`?), or possibly specify the type as a facet (`vector(dim, underlying_type)`) - in both cases I think you want to maintain parity between the SqlClient and database names (and also make the client-side type name shorter).
- Re `SqlDataReader.GetSqlVectorFloat32()`:
    - Note that it's already possible to read any value type via `SqlDataReader.GetFieldValue<T>()` and `SqlDataReader.GetFieldValueAsync<T>()`, so there's no specific need to introduce `GetSqlVectorFloat32()`.
    - `GetSqlVectorFloat32()` would only allow synchronous reading of a vector; to asynchronously read it, users would still have to use `GetFieldValueAsync<T>`, unless you also introduce `GetSqlVectorFloat32Async()`. There are no asynchronous type-specific getters like this on SqlDataReader: the story here is probably that the sync getters (`GetInt32()`, `GetString()`...) were originally introduced back in .NET 1.0, before .NET had generics (so `GetFieldValue<T>()` simply didn't exist and there was no other option). Once `GetFieldValue<T>()` was introduced, such type-specific getters were no longer really necessary.
    - I'll also point out that DateOnly and TimeOnly support was added to SqlClient without type-specific getters (i.e. there's no `SqlDataReader.GetDateOnly()`). So my recommendation would be to not introduce `GetSqlVectorFloat32()` either.
    - If you do introduce it, then depending on what you decide on the type naming question above you may want to rename this to `SqlDataReader.GetVector()` instead of `SqlDataReader.GetSqlVectorFloat32()`.
- Null values
    - The proposed API for creating a null value is: `new SqlVectorFloat32(3)` (null value for a 3-dimensional vector). There's nothing indicating that this API creates a null value, as opposed to e.g. a 3-dimensional vector that can then be populated with data by the user - so I can imagine a bit of user confusion. Would this
    - There's also a static Null property for a 0-dimensional null vector, apparently for DataTable compatibility - but I can't imagine any scenario where this would be useful. Can you provide a bit more context on why this is needed?
    - Instead of the above 2 APIs, would it make sense to instead have a static factory method, e.g. CreateNull(), accepting the number of dimensions? This would make it very clear what the API does etc.
    - BTW is there an example of another SQL Server type where a null value still needs to carry some facet information (like the dimensions in the vector case)? Because OTOH all other SQL Server types have a simple, single null value (which is how it ideally should be), rather than requiring this sort of extra information.
@edwardneal I'm including some thoughts on your comments below, in an answer on your comment.
-
Thanks @roji. I can see your answers, but to provide a little more context to the original point:
- Type names: broadly agree with a name of `SqlVector`. If we think it's possible that SQL Server could introduce a `vector(dim[, underlying_type])` variant, we might want to have a base `SqlVector<T>` class or an `ISqlVector<T>` interface. This would let us tidily map `SqlVector` to `vector(3)`, and `SqlVector<int>` to `vector(3, int)`.
- Memory ownership: completely agree that `SqlVector` should wrap caller-supplied memory rather than perform copies.
- Nullability:
    - There's an implicit assumption in System.Data and in SqlClient that anything which implements `INullable` will also have a static property or field named `Null`. I've assumed it's the only reason why we need this field, and that this type implements `INullable` for consistency with the rest of SqlTypes and for compatibility with DataTables.
    - `DataTable` compatibility is important because it's one way for users to pass table-valued parameters to SqlClient and to perform bulk copies.
    - The closest parallel to this type might be `SqlChars` (which needs to encode the length of a SQL Server `char(X)`). This has no way to set a null value with a specific length. I'm inclined to say that if a `SqlVector` is null, its length should similarly be ignored. Clients will inspect the `Length` property to get the number of dimensions in the value, not in the schema type; the fact that for a non-null value these will always be equal is an implementation detail. If the value is null, `Length` should throw a `SqlNullValueException`. Clients should check `IsNull` before dealing with the value.
On one slightly broader comment: it'd be good to know if SqlVector will be added to this list of types which can appear in a CLR UDT.
-
- Good points about the type name.
    - It would be problematic to have a SqlVector which means "vector of floats", as opposed to `SqlVector<int>` which means "vector of ints".
    - On another note, binary embeddings (where the vector element is a single bit) are also a popular choice. The representation we're aligning on in .NET for this is the BitArray type. This wouldn't fit into a `SqlVector<T>`, in the sense that we don't want to have a `SqlVector<bool>` wrapping a `bool[]` (or `ReadOnlyMemory<bool>`), but rather a SqlBinaryVector which wraps a BitArray.
    - We've had very similar discussions around the .NET Embedding type hierarchy in Microsoft.Extensions.AI (which is very closely related). We ended up having an abstract Embedding superclass, with `Embedding<T>` extending it and wrapping `ReadOnlyMemory<T>`, as well as BinaryEmbedding also extending it, and wrapping BitArray.
    - In SqlClient we could do a similar thing, i.e. have an abstract SqlVector and a `SqlVector<T>` (for float, Half...) and `SqlBinaryVector` for BitArray. Though I'm not sure that in the SqlClient case the abstract base class is actually necessary.
    - So for now I'd recommend maybe just going with `SqlVector<T>` and only supporting `SqlVector<float>`. You can always support `SqlVector<Half>` if and when that becomes supported, plus add support for `SqlBinaryVector` as well. Not sure I'd have the base class right away - that can always be introduced later.
- Re nullability, I think I'd need a bit more context to fully understand this. I specifically am not sure how Null would be useful if it denotes a 0-dimension vector, which would presumably fail if you try to insert that into an actual vector column in SQL Server (which has non-zero dimensions).
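To make the hierarchy discussed above concrete, here is a minimal, self-contained sketch. It is not SqlClient code: the type names come from this thread, but the members (`Memory`, `Bits`, `Length`) and the `unmanaged` constraint are assumptions chosen purely for illustration, and the abstract base is included only to show the shape being debated.

```csharp
using System;
using System.Collections;

// Hypothetical usage: both vector kinds wrap existing data without copying it.
float[] embedding = { 0.1f, 0.2f, 0.3f };
SqlVector floatVector = new SqlVector<float>(embedding);
SqlVector bitVector = new SqlBinaryVector(new BitArray(new[] { true, false, true, true }));

Console.WriteLine(floatVector.Length); // number of float elements
Console.WriteLine(bitVector.Length);   // number of bits

// Hypothetical abstract base; the thread questions whether it is needed at all.
public abstract class SqlVector
{
    public abstract int Length { get; }
}

// Wraps caller-supplied memory (float today, possibly Half later).
public sealed class SqlVector<T> : SqlVector where T : unmanaged
{
    public SqlVector(ReadOnlyMemory<T> memory) => Memory = memory;

    public ReadOnlyMemory<T> Memory { get; }

    public override int Length => Memory.Length;
}

// Wraps a BitArray for binary embeddings, rather than a bool[].
public sealed class SqlBinaryVector : SqlVector
{
    public SqlBinaryVector(BitArray bits) => Bits = bits;

    public BitArray Bits { get; }

    public override int Length => Bits.Length;
}
```

Starting with only `SqlVector<float>` and adding the base class later, as suggested above, would simply mean dropping the `: SqlVector` inheritance from this sketch.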
-
SqlVector's naming sounds sensible.
I think SqlVector<float>.Null is best treated as a strongly-typed DBNull.Value - just like SqlChars, the number of dimensions shouldn't matter for a null value. In various contexts:
- Calling `SqlDataReader.GetFieldValue<SqlVector<float>>()` will return `SqlVector<float>.Null` if the SQL Server value is `null`, irrespective of the number of dimensions in the `vector`. Someone who wants to know the length of the data type should call `SqlDataReader.GetColumnSchema` to discover the result set schema.
- Passing a null `vector(3)` as a parameter would mean setting `SqlParameter.Value` to `SqlVector<float>.Null` and `SqlParameter.Size` to the number of dimensions, just as we'd do for a null `char(3)`.
To feed those semantics through to the API proposal, I'd suggest removing the SqlVector(int) constructor and maintaining the convention that the default parameterless constructor creates a SqlVector<float> instance representing a null value.
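A rough sketch of those semantics follows. It is an illustration only, not the SqlClient implementation: `SqlVector<T>` here is a stand-in type defined in the snippet itself, and the choice to throw from `Memory` as well as `Length` is an assumption. `INullable` and `SqlNullValueException` are the existing types from `System.Data.SqlTypes`.

```csharp
using System;
using System.Data.SqlTypes;

var nullVector = SqlVector<float>.Null;
Console.WriteLine(nullVector.IsNull);

try
{
    _ = nullVector.Length; // a null vector carries no dimension count
}
catch (SqlNullValueException)
{
    Console.WriteLine("Length is not available on a null vector");
}

var populated = new SqlVector<float>(new float[] { 1f, 2f, 3f });
Console.WriteLine(populated.Length);

// Stand-in type for illustration; not the real SqlClient class.
public sealed class SqlVector<T> : INullable where T : unmanaged
{
    private readonly ReadOnlyMemory<T> _memory;

    // Convention from the comment above: the parameterless constructor yields a null value.
    public SqlVector() => IsNull = true;

    public SqlVector(ReadOnlyMemory<T> memory)
    {
        _memory = memory;
        IsNull = false;
    }

    // Static Null member, as System.Data expects of INullable implementations.
    public static SqlVector<T> Null { get; } = new SqlVector<T>();

    public bool IsNull { get; }

    public int Length => IsNull ? throw new SqlNullValueException() : _memory.Length;

    public ReadOnlyMemory<T> Memory => IsNull ? throw new SqlNullValueException() : _memory;
}
```

Under this design the dimension count of a null parameter would travel in `SqlParameter.Size` rather than in the value, which is the parallel with `char(3)` drawn above.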
-
Thanks for the discussion - I've been looking forward to seeing how vector support pans out in SqlClient. I can't speak much about the EF side, but have a few thoughts.
Top-level type definition
I'd suggest sealing the class by default. We can always unseal it later, and it means we avoid virtual method invocations.
I'm not sure what the definition of ISqlVector is - I imagine it's internal.
Do you have any opinions on implementing IEnumerable<float>?
Round-tripping
SQL Server returns a vector as JSON to an unenlightened client, and I assume this is what ToString returns. Since that's a canonical format, it feels natural to have Parse and TryParse methods, and for the return values of Parse and ToString to be able to round-trip:
    public SqlVectorFloat32 Parse(string? s);
    public SqlVectorFloat32 Parse(ReadOnlySpan<char> s);
    public bool TryParse(string? s, out SqlVectorFloat32? result);
    public bool TryParse(ReadOnlySpan<char> s, out SqlVectorFloat32? result);
On a slight tangent: I assume that vector probably uses the invariant culture when parsing and outputting strings - is that correct, or would a SQL Server with a different decimal separator use that separator?
Additional uses
SqlParameter, SqlDataReader and SqlBulkCopy cover most of the uses (although SqlDataReader will also need to use this with GetFieldValue). I assume that this'll automatically result in being supported by SqlCommand.ExecuteScalar.
Is vector supported in user-defined table types? If so, it'll also need support in SqlDataRecord, perhaps by adding GetSqlVectorFloat32 and SetSqlVectorFloat32 methods to that class.
    public virtual SqlVectorFloat32 GetSqlVectorFloat32(int ordinal);
    public virtual void SetSqlVectorFloat32(int ordinal, SqlVectorFloat32 value);
Interoperability
.NET already has a Vector<T> type, which has a lot of intrinsics associated with it. It'd be good to explicitly include this as a test case to make sure that we're able to interoperate with it. System.Numerics.Vectors is part of .NET Core, and the net462 project already holds a transitive reference to the assembly via System.Memory, so perhaps in the future we could integrate more tightly and return a Vector<float> directly.
The copy constructor accepts a float[]. What do you think about making this accept a ReadOnlySpan<float> instead, to open a zero-copy path from Vector<float> into SqlClient? The same "zero-copy" principle might also apply to outputting data too - adding an AsSpan method would allow this.
    public SqlVectorFloat32(params ReadOnlySpan<float> values);
    public ReadOnlySpan<float> AsSpan();
Collection expressions
Although this is purely cosmetic, we could potentially use a collection expression to build the type, as below. This needs the type to implement IEnumerable.
    [CollectionBuilder(typeof(SqlVectorFloat32), nameof(FromSpan))]
    public class SqlVectorFloat32 : INullable, IEnumerable<float>
    {
        [EditorBrowsable(EditorBrowsableState.Never)]
        public static SqlVectorFloat32 FromSpan(ReadOnlySpan<float> values) => new SqlVectorFloat32(values);
    }

    SqlVectorFloat32 vectorValue = [0.03f, 0.1f, 0.8f];
    SqlDataRecord record = new(/*...*/);
    record.SetSqlVectorFloat32(0, [0.03f, 0.1f, 0.8f]);
-
Great comments here @edwardneal.
Some thoughts:
- Re `Vector<T>`: AFAIK in .NET, `Vector<T>` is specifically about CPU vectorization (SIMD). The SQL Server vector type will generally be used to represent embeddings, which are generally an opaque series of numbers that users don't perform operations on. So I'm not sure it's going to be relevant to interoperate directly between SqlClient and .NET `Vector<T>`. However, as long as we allow efficient extraction of the array (or `ReadOnlyMemory<float>`) wrapped by the new proposed vector type, users can easily get that out and then construct a `Vector<T>` over it, if they need to.
- Re accepting `ReadOnlySpan<T>`:
    - This relates to the question of whether SqlVectorFloat32 is a simple wrapper across the value provided in the constructor (e.g. the `float[]`), or whether it copies that value into an internal array. For performance reasons, it's IMHO very important to not force expensive copying - I think SqlVectorFloat32 should be a pure, thin wrapper over data and not copy. I'd also not have both wrapping and copying constructors, as this would lead to confusion as to which one does what: I'd lean towards a very simple type which only wraps and never copies, if possible.
    - A constructor accepting a Span (or a FromSpan method) would have to copy; SqlVectorFloat32 cannot wrap a Span, since Spans are ref structs (and SqlVectorFloat32 would have to be a ref struct itself in order to wrap it, which is impossible as it must be assignable to SqlParameter.Value).
    - However, I do believe SqlVectorFloat32 should accept a `ReadOnlyMemory<float>` (see my comment above).
- Re collection expressions: because of how vectors (embeddings) actually work, users generally never construct them in code, but rather generate them from natural language text, images, etc. In other words, a SqlVectorFloat32 would be created by an embedding generator, e.g. an OpenAI service. So while I'm a big fan of collection expressions, I simply don't think these would be useful in any actual real-world scenario.
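The wrap-versus-copy distinction above can be seen with plain BCL types. The sketch below uses a hypothetical `VectorWrapper` class (not part of SqlClient) that stores a `ReadOnlyMemory<float>`; it shows that wrapping an array or a slice shares the caller's memory, while `ToArray()` allocates a copy.

```csharp
using System;

float[] data = { 1f, 2f, 3f, 4f };

// Wrapping: the implicit float[] -> ReadOnlyMemory<float> conversion does not copy.
var wrapper = new VectorWrapper(data);
data[0] = 42f;
Console.WriteLine(wrapper.Memory.Span[0]); // reflects the change made through the array

// A slice is also just a view over the same array.
var slice = new VectorWrapper(data.AsMemory(1, 2));
Console.WriteLine(slice.Memory.Length);

// ToArray() allocates a new array and copies the elements into it.
float[] copy = wrapper.Memory.ToArray();
data[1] = -1f;
Console.WriteLine(copy[1]); // unaffected by the later change to data

// Hypothetical thin wrapper used only for this illustration.
public sealed class VectorWrapper
{
    public VectorWrapper(ReadOnlyMemory<float> memory) => Memory = memory;

    // ReadOnlyMemory<float> can be stored in a class field or property.
    // A ReadOnlySpan<float> could not: spans are ref structs and cannot live on the heap.
    public ReadOnlyMemory<float> Memory { get; }
}
```

This is also why a span-accepting constructor would have to copy: the span cannot outlive the call, so the only way to keep its contents is to duplicate them.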
-
Thanks @roji. I'm approaching the datatype with the view that although it's mainly intended for AI usage, at its heart vector is a way to efficiently store, insert and retrieve a fixed-size set of floats. That's interesting in its own right, and it'd be a shame if the API shape closed off options.
- `Vector<T>`: you're right that this is about enabling SIMD. I'm interested in making sure that this works well in both directions: we should be able to use `Vector<float>` to efficiently work with an array of floats and use `SqlVector<float>` to pass that to SQL Server without incurring any array copies; similarly, we should be able to query the database for these, read the `SqlVector<float>` and work with the results without having to convert them. Exposing the underlying data as a `ReadOnlyMemory<float>` is a good place to start for this.
- `ReadOnlySpan<T>`: yes, agreed. If we're going to say that `SqlVector<float>` only ever wraps an array, using `ReadOnlyMemory<float>` works perfectly well.
- Collection expressions: I'm not convinced that the type will only ever be used for embeddings, but we can always revisit the idea once `vector`'s being used more widely and we know whether it'd actually be useful.
-
Re using SQL Server vector and SqlClient's SqlVector for things other than embeddings (this is related to both interop with .NET Vector<T> for SIMD, and for enabling collection expressions)... I think the key points here are:
- `vector` supports no operations - on the SQL Server side - aside from similarity search. You can't add, multiply, and, or even index to get a specific float value inside the vector. It's a completely opaque blob.
- `vector` supports floats only. Other databases support types like Half, bit, and sparse vector representations, but no database supports e.g. int, long, double. In other words, the actual vector types follow what is useful for the narrow purpose of representing meaning (embeddings).
The above, by the way, isn't some SQL Server limitation - vector types in all vector databases are similarly limited, since they are generally meant to serve a very narrow purpose: these are totally opaque representations of meaning which aren't meant to be manipulated, but only searched for similarity. Note that in PostgreSQL, the pgvector extension adds new, specialized vector types, rather than reusing the existing, general full-featured array.
Of course, things may change and the world might start using vector databases (and types) in a different way. But at least at the moment, I believe we should follow the actual database support for the vector type, which is to treat it as opaque embeddings. Interoperability with Vector<T> specifically should already be OK as long as we take care to introduce a simple wrapper that doesn't copy (at that point the user can extract the array from SqlVector and use it with Vector<T> - no need for any explicit/specific integrations in SqlClient).
And as you say, we can always add additional things (e.g. collection expressions) as the need arises.
-
@roji @edwardneal Thank you for your valuable inputs. Really appreciate it! We should be able to incorporate some of your suggestions. I will discuss the feedback internally within team and update this discussion thread.
-
Thank you everyone for your thoughtful and detailed feedback — especially @roji and @edwardneal — it’s greatly appreciated and has helped clarify several important design aspects as we move forward with vector support in SqlClient.
Let me take a moment to address the key points raised:
Type Design and Naming
We’ve opted to introduce a concrete class, SqlVectorFloat32, instead of a generic `SqlVector<T>` type for the initial implementation. This decision is primarily based on:
- Alignment with existing SqlType classes, which use concrete type names (e.g., SqlString, SqlBytes) for clarity and simplicity.
- Avoiding the complexity and runtime overhead of validating permitted generic types (float, Half, etc.) if we had gone with `SqlVector<T>`, which would have required additional internal constraints or runtime type checks.
- Future extensibility: We may consider introducing an internal or abstract base (e.g., `SqlVector<T>` or a non-generic SqlVector) to share common behavior across different vector representations (e.g., SqlBinaryVector for bit arrays). However, the public API surface will favor concrete types for better discoverability and alignment with ADO.NET conventions.
We’ll also be sealing the class as suggested, to avoid unnecessary virtual dispatch and better control the lifecycle of this early type.
Null Handling
While we support DBNull.Value for null values (and this integrates cleanly with ADO.NET and EF Core expectations), we still see value in keeping the constructor new SqlVectorFloat32(int dimension).
This allows users to:
- Define parameter metadata explicitly, e.g., when setting up an output parameter or schema-based usage, where dimensionality is known but the actual values are not.
- Preserve structural information even when a value is not assigned, in line with SQL Server’s fixed-dimension vector semantics.
That said, we acknowledge the ambiguity of this API and will work to ensure documentation makes it clear that this constructor produces a null instance with schema metadata.
Memory Ownership and Copying
We fully agree with the principle of minimizing or eliminating data copies, and we are mindful of the need for high-performance scenarios — especially when working with large embeddings.
Currently, SqlVectorFloat32 maintains a single internal representation of the data, and no redundant copies are created. However, to support zero-copy scenarios across all usages (including serialization via TDS), we anticipate needing significant updates to TdsParser.
This is not part of the initial implementation, but is definitely on our radar for follow-up iterations — particularly as we look at optimizing interop and serialization.
SqlDataReader Behavior
The proposed method GetSqlVectorFloat32() will return a SqlVectorFloat32 object that preserves metadata such as vector length, even in the case of null values.
At the same time:
- `GetValue()` will continue to return DBNull.Value for nulls.
- `GetFieldValue<T>()` will behave consistently and return the correct typed null instance.
This aligns with ADO.NET expectations while allowing richer use cases where structural metadata is needed (e.g., when using output parameters or working with loosely defined schemas).
Parsing and Serialization
Regarding TryParse() / Parse():
Our current expectation is that application developers will deserialize the float[] from the JSON representation themselves, using existing APIs (System.Text.Json, JsonDocument, etc.).
Given that:
- JSON is the intermediate format for SQL Server <-> client in certain fallback scenarios
- .NET already has excellent support for JSON
- This format is typically only encountered in low-level or unenlightened client scenarios
We are not currently planning to expose Parse() or TryParse() methods on SqlVectorFloat32. That said, if real-world use cases surface, we can absolutely revisit this.
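As a rough illustration of the "deserialize it yourself" approach described above, the snippet below reads a JSON array of numbers into a `float[]` with `System.Text.Json` and writes it back. The sample string is made up for this example; the exact text SQL Server emits for a vector (number formatting, exponent notation) is not specified in this thread, so treat the input as an assumption.

```csharp
using System;
using System.Text.Json;

// Hypothetical JSON text, standing in for what an unenlightened client might receive.
string json = "[1.5, 2.25, -3]";

// JsonSerializer parses numbers culture-invariantly, so no decimal-separator issues arise here.
float[] values = JsonSerializer.Deserialize<float[]>(json) ?? Array.Empty<float>();
Console.WriteLine(values.Length);

// Serializing the array again produces a JSON array that parses back to the same floats.
string roundTripped = JsonSerializer.Serialize(values);
float[] again = JsonSerializer.Deserialize<float[]>(roundTripped) ?? Array.Empty<float>();
Console.WriteLine(values.AsSpan().SequenceEqual(again));
```

The resulting array can then be passed to the `SqlVectorFloat32(ReadOnlyMemory<float>)` constructor from the proposal, relying on the implicit `float[]` to `ReadOnlyMemory<float>` conversion.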
Thanks Again
Thanks once again for the insightful input and candid feedback. We're committed to making the vector experience in SqlClient both intuitive and performant, and your suggestions are helping shape that foundation.
-
However, I'd advise tightening things up so that this property can only be set for NULL values, and otherwise simply exposes ROM.Length. In other words, in the above proposal Length is settable by the user (it's an Init property). This means that a user can create a SqlVector with mismatching Length and Memory properties, which ideally wouldn't be possible. So I'd advise making Length a read-only property, and only allowing the user to determine it when they construct a null SqlVector instance (via the constructor). The getter would always simply return ROM.Length, unless the instance represents a null, in which case it would return what the user passed in the constructor.
Yes, the current implementation allows Length to be set in the constructor only if the user passes it with
public SqlVector(int length);
in which case it represents a null instance of a given length and the only other constructor
public SqlVector(ReadOnlyMemory<T> memory);
initializes Length to ROM.Length after which it is immutable.
I agree with your suggestion to turn SqlVector(length) into a factory method, CreateNull, to make it clearer.
Regarding
public int Size { get; init; }
Users are not required to interact with SqlVector.Size for any use case other than using Prepare API.
SqlVector.Size is needed because user won't know what they should set the SqlParameter.Size to in case of vector parameters when used in the context of Prepare API.
SqlVector.Size provides the correct value to be used.
-
OK. Re Length:
    public int Length { get; init; } // The init here seems like it should be removed
    public ReadOnlyMemory<T> Memory { get; init; }
Everything you wrote makes sense - I'm only pointing out that in the last state of the proposal, the Length property is initable. This means that the user can instantiate a SqlVector with a non-null ReadOnlyMemory, and also set its Length property to something which does not match the ReadOnlyMemory:
var vector = new SqlVector<float>([1.0f, 2.0f, 3.0f]) { Length = 4 };
This seems wrong, since a SqlVector wrapping a ROM with a length of 3 has its Length property set to 4. So I'm suggesting to remove the init; from the Length property, making it read-only.
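A minimal sketch of that shape, combining a get-only `Length` with the `CreateNull` factory mentioned earlier in the thread. The class below is a stand-in written for this example, not the SqlClient type, and its member layout is an assumption.

```csharp
using System;

var vector = new SqlVector<float>(new float[] { 1.0f, 2.0f, 3.0f });
Console.WriteLine(vector.Length); // always taken from the wrapped memory

var nullVector = SqlVector<float>.CreateNull(4);
Console.WriteLine(nullVector.IsNull);
Console.WriteLine(nullVector.Length); // the dimension count passed to CreateNull

// new SqlVector<float>(...) { Length = 4 } no longer compiles: Length has no init accessor.

// Stand-in type for illustration; not the real SqlClient class.
public sealed class SqlVector<T> where T : unmanaged
{
    private readonly int _nullLength;

    public SqlVector(ReadOnlyMemory<T> memory) => Memory = memory;

    // Private: null instances are only created through CreateNull.
    private SqlVector(int length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        _nullLength = length;
        IsNull = true;
    }

    public static SqlVector<T> CreateNull(int length) => new SqlVector<T>(length);

    public bool IsNull { get; }

    public ReadOnlyMemory<T> Memory { get; }

    // Get-only: derived from the memory, or from the CreateNull argument for null instances.
    public int Length => IsNull ? _nullLength : Memory.Length;
}
```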
Re Size:
> SqlVector.Size is needed because users won't know what to set SqlParameter.Size to for vector parameters when using the Prepare API.
> SqlVector.Size provides the correct value to use.
I see... So to prepare a statement which will send vector parameters of size 4, users are expected to do the following:
    var command = new SqlCommand(...);
    var parameter = new SqlParameter("vector", SqlDbType.Vector);
    var dummyVector = SqlVector<float>.CreateNull(4);
    parameter.Size = dummyVector.Size;
    // dummyVector will no longer be needed - was only used to calculate the byte Size for a vector of size 4
    command.Parameters.Add(parameter);
    command.Prepare();
Is that correct?
If so, may I ask why it was decided that for the vector type, SqlParameter.Size needs to represent the byte length as opposed to the dimension? The byte length of a vector is an internal, TDS-specific thing that users obviously know nothing about, which is why you have to provide some API - the Size property - in order for them to get it. On the other hand, it seems to make more sense for users to just provide the dimensions directly, rather than having to convert the dimensions to the byte length.
I'll point out that SqlParameter.Size already doesn't always represent the byte length; for string type parameters, Size means length in Unicode characters and not in bytes (docs). So I'm again not sure why we'd force the user to deal with the byte length in the vector case.
If there's still some reason why SqlParameter.Size must absolutely mean the vector byte length, then I'd suggest providing a simple static method (static SqlVector<T>.GetByteLength(int dimensions)) rather than having a property for that on SqlVector. A real property takes up space that will never actually be used, is bound to create confusion between Length and Size, etc. (it's also not clear why users would want to set it - it's initable as well). A simple static method seems more than sufficient (though again, the simplest/easiest is for users to just set the dimensions directly on SqlParameter.Size).
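A minimal sketch of what such a static helper could look like. The class name is hypothetical, and the 8-byte header is an assumption made for illustration only; the authoritative wire layout is whatever the TDS specification defines for VECTOR.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper; not part of SqlClient.
public static class VectorSizeSketch
{
    // ASSUMPTION: a fixed per-value header of 8 bytes precedes the elements.
    public const int AssumedHeaderBytes = 8;

    // Converts a dimension count into a byte length for element type T.
    public static int GetByteLength<T>(int dimensions) where T : unmanaged
    {
        if (dimensions < 0) throw new ArgumentOutOfRangeException(nameof(dimensions));

        // Unsafe.SizeOf<T>() returns the managed size of T (4 for float).
        return checked(AssumedHeaderBytes + dimensions * Unsafe.SizeOf<T>());
    }
}
```

Under that assumption, `VectorSizeSketch.GetByteLength<float>(4)` evaluates to 8 + 4 * 4 = 24, and no dummy vector instance is needed to obtain the value.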
-
> This seems wrong, since a SqlVector wrapping a ROM with a length of 3 has its Length property set to 4. So I'm suggesting to remove the init; from the Length property, making it read-only.
You're right. I'll fix this.
> If so, may I ask why it was decided that for the vector type, SqlParameter.Size needs to represent the byte length as opposed to the dimension? The byte length of a vector is an internal, TDS-specific thing that users obviously know nothing about, which is why you have to provide some API - the Size property - in order for them to get it. On the other hand, it seems to make more sense for users to just provide the dimensions directly, rather than having to convert the dimensions to the byte length.
The issue with interpreting SqlParameter.Size as the number of elements in the vector is that it doesn't tell us the element type, unless we define multiple SqlDbTypes for every vector type such as Float32, Float16, Int32, etc. We need to know the element type as well in order to calculate the correct size in bytes for transmitting the data over TDS.
I agree with your suggestion to change the property into a GetByteLength method.
-
> unless we define multiple SqlDbTypes for every vector type such as Float32, Float16, Int32
Yes, I understand; it's my strong assumption that if/when other vector types are introduced in SQL Server in the future, those would have their own SqlDbType entries as well (regardless of preparing statements, I think you need a SqlDbType entry for each actual SQL Server type). Or if SQL Server itself introduces some other new facet type to represent Float32/Float16, you'd need to add that to SqlParameter next to Size, Precision and Scale. But I think that's really unlikely - just introducing a couple of new types like halfvector, binaryvector would be my assumption for how things develop (you can always check with SQL Server folks if that assumption makes sense).
So to summarize, I'd indeed expect SqlDbType to be sufficient to identify the specific vector types (including float32 vs. float16), and therefore the combination of SqlDbType + Size (as dimensions) should be sufficient.
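A minimal sketch of that argument: if each vector element type has its own type tag, the driver can derive the element payload size from the tag plus the dimension count, without the user supplying a byte length. The enum below is a hypothetical stand-in for distinct SqlDbType entries, and any wire-format header is deliberately left out.

```csharp
using System;

// Hypothetical tag standing in for one SqlDbType entry per vector element type.
public enum VectorKindSketch { Float32, Float16 }

public static class VectorFacetSketch
{
    // Element width in bytes implied by the type tag alone.
    public static int ElementBytes(VectorKindSketch kind) => kind switch
    {
        VectorKindSketch.Float32 => 4,
        VectorKindSketch.Float16 => 2,
        _ => throw new ArgumentOutOfRangeException(nameof(kind)),
    };

    // Payload bytes for the elements only, given the tag and Size-as-dimensions.
    public static int PayloadBytes(VectorKindSketch kind, int dimensions)
    {
        if (dimensions < 0) throw new ArgumentOutOfRangeException(nameof(dimensions));
        return checked(ElementBytes(kind) * dimensions);
    }
}
```

For example, `PayloadBytes(VectorKindSketch.Float32, 4)` is 16 and `PayloadBytes(VectorKindSketch.Float16, 4)` is 8, so the same dimension count maps to different byte lengths purely through the type tag.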
-
Here is the revised API for the vector data type:

    namespace Microsoft.Data.SqlTypes
    {
        /// <summary>
        /// Represents a SQL vector of unmanaged type elements (currently supports float32 and more will be added later).
        /// </summary>
        public readonly struct SqlVector<T> : System.Data.SqlTypes.INullable
            where T : unmanaged
        {
            /// <summary>
            /// Creates a vector with the specified elements.
            /// </summary>
            public SqlVector(System.ReadOnlyMemory<T> memory) { }

            /// <summary>
            /// Indicates whether this vector instance is null.
            /// </summary>
            public bool IsNull { get; }

            /// <summary>
            /// Required for maintaining the compatibility with DataTable. Points to a null value.
            /// </summary>
            public static SqlVector<T>? Null { get; }

            /// <summary>
            /// Gets the number of elements in this vector.
            /// </summary>
            public int Length { get; }

            /// <summary>
            /// Gets the vector elements as read-only memory.
            /// </summary>
            public ReadOnlyMemory<T> Memory { get; }

            /// <summary>
            /// Creates a null instance of a vector with the specified length.
            /// </summary>
            public static SqlVector<T> CreateNull(int length);
        }
    }
-
Although we've got SqlVector<float> support, it looks like SQL Server 2025 is also releasing half-precision vectors, with binary transport support not yet available.
In .NET, might this require support for SqlVector<Half>?
-
That's correct. We are scoping the work for .NET and plan to use System.Half. For .NET Framework, there is no half-precision floating point type, so we're not planning to add support there.