3

I started working on an existing SQL Server Database System in which most of the fields are stored as text. Except some IDs all fields are varchar (phone numbers, zip codes, dates, addresses, monetary values etc.).

This is not how I learned to build a database. When asked collegues said that it is easier this way. Is it bad practice to keep all fields in varchar? How could I argue that it should be changed?

Michael Green
25.3k13 gold badges54 silver badges100 bronze badges
asked Oct 20, 2014 at 7:17
6
  • 2
    Why it is better, ask them. Dates field can be of Date/DateTime which will be used in times you need sorting, and handling other functions. Arithmetic calculation in sql query would be a mess, if any row has unexpected datatype. Performance issue in future. Commented Oct 20, 2014 at 7:37
  • 6
    Without the correrct data type it's a lot harder to "simulate" built-in function. For example, how would you select data with a datetime value between 24hrs ago and "now"? Of course, this can be done using implicit or explicit conversion. But this may fail due to incompatible values. Why not doing it right the first time and let the database handle datatype integrity? Commented Oct 20, 2014 at 8:07
  • 6
    Data validation is the most important reason for choosing the correct data type. Try to prevent storing an invalid date like 2014-02-29 into a varchar column. You can also rely on a column of type integer to never have a value of foobar in it. Commented Oct 20, 2014 at 12:07
  • It's worth pointing out that phone numbers and Zip codes should be character data types. Commented Oct 20, 2014 at 12:45
  • sqlblog.org/2009/10/12/… sqlblog.org/2009/10/16/… Commented Oct 20, 2014 at 13:04

6 Answers 6

8

Size is one consideration. An int can hold up to -2,147,483,648 in four bytes. A char will need 11 bytes to hold the same value.

There are built-in functions to manipulate the various data types. DATEADD() and DATEDIFF() are two examples. This will not be possible with date-stored-as-text. Constantly CASTing back and forth will not make for efficient processing, or legible code.

Automatic validation is another foregone benefit with the all-text approach. You may think that a column contains dates but there will be nothing to stop someone entering the value '2014-13-97'.

Sorting is unlikely to give the intended result with columns which are "really" numbers. For example, if a column contained integers '1' through '100', and the query sorted by this column, one would expect the result to be

1, 2 .. 10, 11 .. 20, 21 .. 100

However, the actual result is more likely to be

1, 10, 100, 11, 12, 2, 20, 21 ...

There may be similar concerns for dates, depending on the chosen character representation. Again type casting can cure this with the costs mentioned previously.

Occasionally you will find strings which contain digits only. Examples are national identity numbers, bank account numbers, phone numbers and such like. The common characteristic of such values is that it does not make sense to perform calculations on them. Other things being equal it may be OK to store these as char() columns, especially if they have optional embedded alpha characters, but the above considerations still apply.

answered Oct 20, 2014 at 9:47
5

Your colleague is correct that it is easier to simply not think about it and just store everything as a varchar. But this comes at a large cost in terms of space requirements, performance, flexibility in querying data, and most importantly, lack of data integrity. This is not just a one-time cost; it is paid repeatedly over the lifecycle of the application.

I suggest one follow the best practice and store data using the data type most appropriate for the domain of data stored, as you learned.

answered Oct 20, 2014 at 13:39
4

There are several situations in which it's better to represent numbers using some kind of numeric data-type. It's a little more efficient, but that's just the beginning. You get support for built-in arithmetic using SQL operators without performing type conversions at run time. Not only do type conversions slow things down, but they can result in numerous run time errors, and catching these errors requires added programming effort.

In general, it's more worthwhile to filter out bad data on the way in that to discover the error at retrieve time.

It's often worthwhile to distinguish between a number, such as an annual salary, and a numeric character string, such as a telephone number. Storing numeric character strings in VARCHAR containers is often good practice.

answered Oct 20, 2014 at 11:46
2

Zip codes should be stored as text, as some start with 0 (screwing up formatting/sorting) and there is no reason to do math on them. Also, if you want to store global postal codes, they often contain letters.

Phone numbers are a maybe for text, especially if there might be extensions. Or you want to store numbers like 1-800-GOT-JUNK. There's no reason to do mathematical operations on phone numbers, so text is probably fine.

Storing dates and money amounts as text is dumb and allows for invalid values and prevents operations like addition and subtraction.

ONE good reason for a table with all text fields is for CSV import, but this should be temp tables only.

answered Oct 20, 2014 at 17:15
0

There are different data types in sql to store the data. if the all data are same then varchar is good to store.

but in the future, you need the operation on the data, you can't because of limited functionality.

better to go with the data type of data.

answered Oct 20, 2014 at 8:00
0

I have spent most of my long career with pick/multi-value databases for programming complex business systems. They are very simple and I have never seen a business which is too large or complex for this type of database. A record is string of tab delimited text, no matter what the data type. In the programming language the text is converted into either a Date, Time, Number or left as text. The query language also recognizes these 4 basic types of data. When I create a relational database I use only varchar, int, date and possibly time or datetime.

answered Mar 12, 2016 at 16:41

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.