Limit parameter does not work in paginateScan and paginateQuery in @aws-sdk/lib-dynamodb #5967
Checkboxes for prior research
- I've gone through Developer Guide and API reference
- I've checked AWS Forums and StackOverflow.
- I've searched for previous similar issues and didn't find any solution.
Describe the bug
The Limit parameter can be specified in paginateScan and paginateQuery in @aws-sdk/lib-dynamodb, but it has no effect. All data in the table is queried or scanned regardless of the Limit value you specify.
SDK version number
@aws-sdk/lib-dynamodb@3.540.0
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
Node.js 20.x
Reproduction Steps
This is the code that actually causes the problem.
// lib/cdk-sample-stack.sampleFunc.ts
import { paginateScan, DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';

const SAMPLE_TABLE_NAME = process.env.SAMPLE_TABLE_NAME || '';

interface DataItem {
  id: string;
  timestamp: number;
}

const ddbDocClient = DynamoDBDocument.from(
  new DynamoDBClient({
    region: 'ap-northeast-1',
    apiVersion: '2012-08-10',
  })
);

export const handler = async (): Promise<void> => {
  const paginator = paginateScan(
    {
      client: ddbDocClient,
    },
    {
      TableName: SAMPLE_TABLE_NAME,
      Limit: 5, // DOES NOT WORK
    }
  );
  const items: DataItem[] = [];
  for await (const page of paginator) {
    console.log(page.Count);
    items.push(...(page.Items as DataItem[]));
  }
  console.log(items);
};
This is the sample data file src/tableData/table1.csv stored in the table.
d001,1700625273
d001,1699658818
d001,1703858878
d001,1681316462
d001,1695108297
d001,1694674832
d001,1680945699
d001,1701799579
d001,1696271173
d001,1685651084
d002,1706301230
d002,1679314750
d002,1701457171
d002,1685919651
d002,1684091128
Create the Lambda function and DynamoDB table that reproduce the issue using AWS CDK.
// lib/cdk-sample-stack.ts
import {
  aws_lambda,
  aws_logs,
  aws_lambda_nodejs,
  aws_dynamodb,
  aws_s3,
  aws_s3_deployment,
  Stack,
  RemovalPolicy,
  CfnOutput,
} from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class CdkSampleStack extends Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    const bucket = new aws_s3.Bucket(this, 'Bucket', {
      removalPolicy: RemovalPolicy.DESTROY,
      autoDeleteObjects: true,
    });

    new aws_s3_deployment.BucketDeployment(this, 'DeploySampleTableData', {
      sources: [aws_s3_deployment.Source.asset('./src/tableData')],
      destinationBucket: bucket,
    });

    const sampleTable = new aws_dynamodb.Table(this, 'SampleTable', {
      partitionKey: { name: 'id', type: aws_dynamodb.AttributeType.STRING },
      sortKey: { name: 'timestamp', type: aws_dynamodb.AttributeType.NUMBER },
      billingMode: aws_dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: RemovalPolicy.DESTROY,
      importSource: {
        inputFormat: aws_dynamodb.InputFormat.csv({
          delimiter: ',',
          headerList: ['id', 'timestamp'],
        }),
        bucket,
      },
    });

    const sampleFunc = new aws_lambda_nodejs.NodejsFunction(
      this,
      'SampleFunc',
      {
        architecture: aws_lambda.Architecture.ARM_64,
        runtime: aws_lambda.Runtime.NODEJS_20_X,
        logGroup: new aws_logs.LogGroup(this, 'SampleFuncLogGroup', {
          removalPolicy: RemovalPolicy.DESTROY,
        }),
        environment: {
          SAMPLE_TABLE_NAME: sampleTable.tableName,
        },
      }
    );

    sampleTable.grantReadData(sampleFunc);

    new CfnOutput(this, 'SampleFuncName', {
      value: sampleFunc.functionName,
    });
  }
}
Deploy resources with CDK commands.
npx cdk deploy --require-approval never --method=direct
Observed Behavior
When I run the Lambda function built above, all data in the table is retrieved and written to the logs, regardless of the value specified for the Limit parameter of paginateScan.
15
[
  { id: 'd001', timestamp: 1680945699 },
  { id: 'd001', timestamp: 1681316462 },
  { id: 'd001', timestamp: 1685651084 },
  { id: 'd001', timestamp: 1694674832 },
  { id: 'd001', timestamp: 1695108297 },
  { id: 'd001', timestamp: 1696271173 },
  { id: 'd001', timestamp: 1699658818 },
  { id: 'd001', timestamp: 1700625273 },
  { id: 'd001', timestamp: 1701799579 },
  { id: 'd001', timestamp: 1703858878 },
  { id: 'd002', timestamp: 1679314750 },
  { id: 'd002', timestamp: 1684091128 },
  { id: 'd002', timestamp: 1685919651 },
  { id: 'd002', timestamp: 1701457171 },
  { id: 'd002', timestamp: 1706301230 }
]
Expected Behavior
At most the number of items specified by the Limit parameter is returned.
Possible Solution
No response
Additional Information/Context
No response
-
Hi @cm-rwakatsuki ,
When you use a paginator and want to set the maximum number of results per page, use the paginator config's pageSize rather than Limit in the request parameters; Limit is meant for non-paginated requests.
const paginator = paginateScan(
  {
    client: ddbDocClient,
+   pageSize: 1 // functions similarly to Limit
  },
  {
    TableName: SAMPLE_TABLE_NAME,
-   Limit: 1, // DOES NOT WORK
  }
);
By specifying the pageSize parameter, we can see the results printed one item per page:
Received page with 1 items
[ { id: 'd001', timestamp: 1685651084 } ]
Received page with 1 items
[ { id: 'd002', timestamp: 1684091128 } ]
Received page with 1 items
[ { id: '123', name: 'Test Item' } ]
Received page with 0 items
[]
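To illustrate why a Limit passed in the request can be ignored while pageSize takes effect, here is a simplified model of a paginator loop. It is a sketch, not the SDK's source: it assumes the paginator assigns config.pageSize to the request's Limit on every iteration, which would match the behavior reported above. The names paginateSketch, makeFakeScan, ScanInput and ScanOutput are hypothetical, and the in-memory scan only stands in for DynamoDB.

```typescript
// Hypothetical, simplified request/response shapes (not the SDK's types).
interface ScanInput {
  TableName: string;
  Limit?: number;
  ExclusiveStartKey?: number;
}

interface ScanOutput<T> {
  Items: T[];
  Count: number;
  LastEvaluatedKey?: number;
}

type ScanFn<T> = (input: ScanInput) => Promise<ScanOutput<T>>;

// Hypothetical in-memory stand-in for a table scan.
// It honours Limit and ExclusiveStartKey (here simply an array index).
function makeFakeScan<T>(rows: T[]): ScanFn<T> {
  return async (input: ScanInput): Promise<ScanOutput<T>> => {
    const start = input.ExclusiveStartKey ?? 0;
    const end =
      input.Limit === undefined
        ? rows.length
        : Math.min(rows.length, start + input.Limit);
    const Items = rows.slice(start, end);
    return {
      Items,
      Count: Items.length,
      LastEvaluatedKey: end < rows.length ? end : undefined,
    };
  };
}

// Simplified model of a paginator (assumption: Limit is overwritten by
// config.pageSize on every request, so a caller-supplied Limit is lost).
async function* paginateSketch<T>(
  config: { pageSize?: number },
  input: ScanInput,
  scan: ScanFn<T>
): AsyncGenerator<ScanOutput<T>> {
  let token: number | undefined = undefined;
  let hasNext = true;
  while (hasNext) {
    const page: ScanOutput<T> = await scan({
      ...input,
      ExclusiveStartKey: token,
      Limit: config.pageSize, // replaces input.Limit, even when pageSize is undefined
    });
    yield page;
    token = page.LastEvaluatedKey;
    hasNext = token !== undefined;
  }
}
```

Under this model, calling paginateSketch({}, { TableName: 't', Limit: 5 }, scan) over 15 rows yields a single page of 15 items, while paginateSketch({ pageSize: 5 }, { TableName: 't' }, scan) yields three pages of 5.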
To limit the total number of results returned, stop iterating after a fixed number of pages: number of pages × results per page = total results returned.
async function scanTableWithLimit() {
  const ddbDocClient = DynamoDBDocumentClient.from(client);
  const paginator = paginateScan(
    {
      client: ddbDocClient,
      pageSize: 1,
    },
    {
      TableName: tableName,
    }
  );
  const LIMIT = 2;
  let count = 0;
  for await (const page of paginator) {
    console.log(`Received page with ${page.Items.length} items`);
    console.log(page.Items);
    if (++count >= LIMIT) {
      break;
    }
  }
}
This results in a total of 2 results returned (2 pages, 1 result per page):
Received page with 1 items
[ { id: 'd001', timestamp: 1685651084 } ]
Received page with 1 items
[ { id: 'd002', timestamp: 1684091128 } ]
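Counting pages works when every page is full. As an alternative, the cap can be applied to items rather than pages. The following is a minimal sketch of that idea: takeItems and the Page interface are hypothetical helpers, not part of the SDK, and fakePaginator exists only so the example runs without AWS access.

```typescript
// Minimal page shape: only the Items field is needed here.
interface Page<T> {
  Items?: T[];
}

// Collects at most `limit` items from any async iterable of pages.
// Returning from inside `for await` closes the iterator, so no further
// pages are requested once the cap is reached.
async function takeItems<T>(
  pages: AsyncIterable<Page<T>>,
  limit: number
): Promise<T[]> {
  const items: T[] = [];
  if (limit <= 0) {
    return items;
  }
  for await (const page of pages) {
    for (const item of page.Items ?? []) {
      items.push(item);
      if (items.length >= limit) {
        return items;
      }
    }
  }
  return items;
}

// Hypothetical stand-in for a paginator: yields `data` in pages of
// `pageSize` items. Used only to make the sketch runnable.
async function* fakePaginator<T>(
  data: T[],
  pageSize: number
): AsyncGenerator<Page<T>> {
  for (let i = 0; i < data.length; i += pageSize) {
    yield { Items: data.slice(i, i + pageSize) };
  }
}
```

Since paginateScan returns an async iterable of pages with an optional Items field, something like takeItems(paginateScan({ client: ddbDocClient, pageSize: 5 }, { TableName: tableName }), 5) should work the same way, though that call is untested here.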
====
Non-paginated request with Limit:
async function scanTableWithLimit() {
  try {
    const response = await client.send(
      new ScanCommand({ TableName: tableName, Limit: 1 })
    );
    console.log(response.Items);
  } catch (error) {
    console.error(error);
  }
}
This will indeed return only 1 result:
[ { id: { S: 'd001' }, timestamp: { N: '1685651084' } } ]
I hope this clarifies things.
All the best,
Ran~
-
@RanVaknin
Thank you for your detailed answer! From now on, we will use pageSize.
Also, it would be better to remove the Limit parameter from the arguments accepted by paginateScan and paginateQuery.
-
@RanVaknin I faced a similar issue too. In my case, I expect to get 25 results for every request I send to DynamoDB. If I set pageSize to 1, will it affect RCU consumption and increase costs? This wasn't the case with the https://github.com/aws/aws-sdk-js-v3/blob/v3.431.0/clients/client-dynamodb/src/pagination/QueryPaginator.ts version, where I got the results I expected. Any recommendations?