I’m writing an app that will query by customer ID, pulling one customer’s data at a time. The search criterion is the customer ID, which is also the primary key, so each lookup returns a single row. I plan on having just one database with one table containing text data only. My intent is to have 100,000 rows and 20 columns, with each field holding at most 50 characters.
I’m hoping someone can point me in the right direction regarding sizing for this. I know it won’t be exact, but I’m looking for a rough estimate and guidance on how to go about sizing the database. Any guides or reference sites would be much appreciated.
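For reference, a minimal sketch of the schema and lookup described above; the table and column names are hypothetical, and varchar(50) simply stands in for "text of at most 50 characters":

CREATE TABLE customer (
    customer_id varchar(50) PRIMARY KEY,
    first_name  varchar(50),
    last_name   varchar(50)
    -- ...plus the remaining text columns, each holding at most 50 characters
);

-- each lookup hits the primary key and returns at most one row
SELECT * FROM customer WHERE customer_id = 'CUST-000123';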
- Who cares? What's the smallest hard drive you can buy retail these days, 1 TB? – Neil McGuigan, Oct 10, 2017 at 18:34
- That is a small database for Postgres and today's hardware. Is there some reason you are concerned about size? – Basil Bourque, Oct 11, 2017 at 6:04
1 Answer
Don't worry about it.
- 100,000 rows is nothing.
- 20 columns is nothing.
You're at a totally reasonable workload and don't need to concern yourself with this; spend your time on something more productive. A precise figure would require knowing
- all of the columns,
- the types of all the columns,
- the actual inputs in the columns, and whether those specific columns are TOASTable.
Just work on features, come back when something is too slow or doesn't work.
If you want a rough estimate (napkin math), figure for a row of m columns:
- 1*m bytes of varlena/text overhead (a 1-byte header per short text value)
- 50*m bytes of contents (assuming ASCII for the contents)
With each row, add:
- 23 bytes for the row overhead
- 8 bytes for padding (worst case)
And for the database:
- everything is rounded up into 8 kB disk pages.
So something like
SELECT (( (50+1)*20 + 23 + 8) * 1e5) / 1e6; -- 1e5 rows, 1e6 bytes in a MB
So that's 105.1 MB.
In the grand scheme of things, that rounds to 0.
Verification
CREATE TABLE foo (
c10 text DEFAULT repeat('x', 50),
c11 text DEFAULT repeat('x', 50),
c12 text DEFAULT repeat('x', 50),
c13 text DEFAULT repeat('x', 50),
c14 text DEFAULT repeat('x', 50),
c15 text DEFAULT repeat('x', 50),
c16 text DEFAULT repeat('x', 50),
c17 text DEFAULT repeat('x', 50),
c18 text DEFAULT repeat('x', 50),
c19 text DEFAULT repeat('x', 50),
c20 text DEFAULT repeat('x', 50),
c21 text DEFAULT repeat('x', 50),
c22 text DEFAULT repeat('x', 50),
c23 text DEFAULT repeat('x', 50),
c24 text DEFAULT repeat('x', 50),
c25 text DEFAULT repeat('x', 50),
c26 text DEFAULT repeat('x', 50),
c27 text DEFAULT repeat('x', 50),
c28 text DEFAULT repeat('x', 50),
c29 text DEFAULT repeat('x', 50)
);
INSERT INTO foo(c10)
SELECT repeat('x',50)
FROM generate_series(1,1e5);
test=# \dt+ foo;
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+----------+--------+-------------
public | foo | table | ecarroll | 112 MB |
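The measured 112 MB comes in a bit above the 105 MB napkin figure because whole rows are packed into fixed 8 kB pages. As a rough sketch of that refinement (the 24-byte page header, the 4-byte line pointer per row, and the 8-byte alignment used below are assumed Postgres on-disk details, not part of the napkin math above):

-- per-tuple footprint: 23-byte header padded to 24, plus 20 * (1 + 50) bytes of
-- text data rounded up to a multiple of 8, plus a 4-byte line pointer
SELECT ceil(1e5 / floor((8192 - 24) / (24 + ceil(20 * (1 + 50) / 8.0) * 8 + 4)))
       * 8192 / 1024^2 AS size_mib;
-- ≈ 112 MiB, which lines up with the \dt+ output above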
- I am glad my napkin math was close to yours. – Anthony Genovese, Oct 9, 2017 at 23:16