Limit parameter does not work in paginateScan and paginateQuery in @aws-sdk/lib-dynamodb #5967

cm-rwakatsuki started this conversation in General


Describe the bug

The Limit parameter can be specified in paginateScan and paginateQuery in @aws-sdk/lib-dynamodb, but it has no effect. All data in the table is queried or scanned regardless of the Limit value specified.

SDK version number

@aws-sdk/lib-dynamodb@3.540.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

Node.js 20.x

Reproduction Steps

The following code reproduces the problem.

// lib/cdk-sample-stack.sampleFunc.ts
import { paginateScan, DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';

const SAMPLE_TABLE_NAME =
  process.env.SAMPLE_TABLE_NAME || '';

interface DataItem {
  id: string;
  timestamp: number;
}

const ddbDocClient = DynamoDBDocument.from(
  new DynamoDBClient({
    region: 'ap-northeast-1',
    apiVersion: '2012-08-10',
  })
);

export const handler = async (): Promise<void> => {
  const paginator = paginateScan(
    {
      client: ddbDocClient,
    },
    {
      TableName: SAMPLE_TABLE_NAME,
      Limit: 5, // DOES NOT WORK
    }
  );
  const items: DataItem[] = [];
  for await (const page of paginator) {
    console.log(page.Count);
    items.push(...(page.Items as DataItem[]));
  }
  console.log(items);
};

This is the sample data file, src/tableData/table1.csv, that is imported into the table.

d001,1700625273
d001,1699658818
d001,1703858878
d001,1681316462
d001,1695108297
d001,1694674832
d001,1680945699
d001,1701799579
d001,1696271173
d001,1685651084
d002,1706301230
d002,1679314750
d002,1701457171
d002,1685919651
d002,1684091128

Create the Lambda function and DynamoDB table that reproduce the issue using AWS CDK.

// lib/cdk-sample-stack.ts
import {
  aws_lambda,
  aws_logs,
  aws_lambda_nodejs,
  aws_dynamodb,
  aws_s3,
  aws_s3_deployment,
  Stack,
  RemovalPolicy,
  CfnOutput,
} from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class CdkSampleStack extends Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    const bucket = new aws_s3.Bucket(this, 'Bucket', {
      removalPolicy: RemovalPolicy.DESTROY,
      autoDeleteObjects: true,
    });

    new aws_s3_deployment.BucketDeployment(this, 'DeploySampleTableData', {
      sources: [aws_s3_deployment.Source.asset('./src/tableData')],
      destinationBucket: bucket,
    });

    const sampleTable = new aws_dynamodb.Table(this, 'SampleTable', {
      partitionKey: { name: 'id', type: aws_dynamodb.AttributeType.STRING },
      sortKey: { name: 'timestamp', type: aws_dynamodb.AttributeType.NUMBER },
      billingMode: aws_dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: RemovalPolicy.DESTROY,
      importSource: {
        inputFormat: aws_dynamodb.InputFormat.csv({
          delimiter: ',',
          headerList: ['id', 'timestamp'],
        }),
        bucket,
      },
    });

    const sampleFunc = new aws_lambda_nodejs.NodejsFunction(
      this,
      'SampleFunc',
      {
        architecture: aws_lambda.Architecture.ARM_64,
        runtime: aws_lambda.Runtime.NODEJS_20_X,
        logGroup: new aws_logs.LogGroup(this, 'SampleFuncLogGroup', {
          removalPolicy: RemovalPolicy.DESTROY,
        }),
        environment: {
          SAMPLE_TABLE_NAME: sampleTable.tableName,
        },
      }
    );

    sampleTable.grantReadData(sampleFunc);

    new CfnOutput(this, 'SampleFuncName', {
      value: sampleFunc.functionName,
    });
  }
}

Deploy resources with CDK commands.

npx cdk deploy --require-approval never --method=direct

Observed Behavior

When I run the Lambda function built above, all data in the table is retrieved and written to the logs, regardless of the value specified for the Limit parameter of paginateScan.

15
[
 { id: 'd001', timestamp: 1680945699 },
 { id: 'd001', timestamp: 1681316462 },
 { id: 'd001', timestamp: 1685651084 },
 { id: 'd001', timestamp: 1694674832 },
 { id: 'd001', timestamp: 1695108297 },
 { id: 'd001', timestamp: 1696271173 },
 { id: 'd001', timestamp: 1699658818 },
 { id: 'd001', timestamp: 1700625273 },
 { id: 'd001', timestamp: 1701799579 },
 { id: 'd001', timestamp: 1703858878 },
 { id: 'd002', timestamp: 1679314750 },
 { id: 'd002', timestamp: 1684091128 },
 { id: 'd002', timestamp: 1685919651 },
 { id: 'd002', timestamp: 1701457171 },
 { id: 'd002', timestamp: 1706301230 }
]

Expected Behavior

At most the number of items specified by the Limit parameter is returned.

Possible Solution

No response

Additional Information/Context

No response


Replies: 2 comments 1 reply


Hi @cm-rwakatsuki ,

When you use a paginator and want to set the maximum number of results per page, use pageSize in the paginator configuration rather than Limit in the request parameters, as Limit is intended for non-paginated requests.

  const paginator = paginateScan(
    {
      client: ddbDocClient,
+     pageSize: 1, // functions similarly to Limit
    },
    {
      TableName: SAMPLE_TABLE_NAME,
-     Limit: 1, // DOES NOT WORK
    }
  );

With the pageSize parameter specified, the output shows one result per page:

Received page with 1 items
[ { id: 'd001', timestamp: 1685651084 } ]
Received page with 1 items
[ { id: 'd002', timestamp: 1684091128 } ]
Received page with 1 items
[ { id: '123', name: 'Test Item' } ]
Received page with 0 items
[]

If you want to limit the total number of results returned, cap the number of pages you read, since number of pages × results per page = total results returned:

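import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, paginateScan } from '@aws-sdk/lib-dynamodb';

// Assumed setup; the original snippet did not show these definitions.
const client = new DynamoDBClient({});
const tableName = process.env.SAMPLE_TABLE_NAME || '';

async function scanTableWithLimit() {
  const ddbDocClient = DynamoDBDocumentClient.from(client);
  const paginator = paginateScan(
    {
      client: ddbDocClient,
      pageSize: 1, // at most 1 item per page
    },
    {
      TableName: tableName,
    }
  );
  const LIMIT = 2; // maximum number of pages to read
  let count = 0;
  for await (const page of paginator) {
    console.log(`Received page with ${page.Items?.length ?? 0} items`);
    console.log(page.Items);
    if (++count >= LIMIT) {
      break; // stop iterating so no further pages are requested
    }
  }
}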

This results in a total of 2 results returned (2 pages, 1 result per page):

Received page with 1 items
[ { id: 'd001', timestamp: 1685651084 } ]
Received page with 1 items
[ { id: 'd002', timestamp: 1684091128 } ]
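
Counting pages caps the total at pages × pageSize. If the goal is a cap on the number of items instead, a small helper can count items and stop early. The following is only a sketch: takeItems and fakePages are hypothetical names, not part of the SDK, and fakePages is an in-memory stand-in for a paginator so the example runs without DynamoDB.

```typescript
// Minimal page shape: only the field this helper reads.
interface Page<T> {
  Items?: T[];
}

// Collect at most `maxItems` items from any async iterable of pages,
// then stop iterating so that no further pages are requested.
async function takeItems<T>(
  pages: AsyncIterable<Page<T>>,
  maxItems: number
): Promise<T[]> {
  const items: T[] = [];
  if (maxItems <= 0) {
    return items;
  }
  for await (const page of pages) {
    for (const item of page.Items ?? []) {
      items.push(item);
      if (items.length >= maxItems) {
        return items; // returning here ends the iteration early
      }
    }
  }
  return items; // the source had fewer than `maxItems` items
}

// Hypothetical in-memory stand-in for a paginator (assumption for the demo):
// it yields `all` in chunks of `pageSize`.
async function* fakePages<T>(
  all: T[],
  pageSize: number
): AsyncGenerator<Page<T>> {
  for (let i = 0; i < all.length; i += pageSize) {
    yield { Items: all.slice(i, i + pageSize) };
  }
}
```

With the real SDK, the paginator returned by paginateScan should be usable as the pages argument, for example takeItems(paginator, 5), assuming its pages expose Items as in the snippets above.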

====

Non-paginated request with Limit:

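import { DynamoDBClient, ScanCommand } from '@aws-sdk/client-dynamodb';

// Assumed setup; the original snippet did not show these definitions.
const client = new DynamoDBClient({});
const tableName = process.env.SAMPLE_TABLE_NAME || '';

async function scanTableWithLimit() {
  try {
    const response = await client.send(new ScanCommand({
      TableName: tableName,
      Limit: 1, // honored here because no paginator is involved
    }));
    console.log(response.Items);
  } catch (error) {
    console.error(error);
  }
}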

This will indeed return only 1 result:

[ { id: { S: 'd001' }, timestamp: { N: '1685651084' } } ]
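
A single request with Limit returns one page at most. To keep reading while still controlling Limit per request, the LastEvaluatedKey of one response can be passed as the ExclusiveStartKey of the next. Below is a rough sketch of that loop, not the SDK's own implementation: scanUpTo, ScanFn and makeFakeScan are hypothetical names, the request and response shapes are heavily simplified, and the key is modelled as a plain number, whereas DynamoDB returns a key object.

```typescript
// Simplified request/response shapes modelled on Scan (assumption: the real
// SDK types carry many more fields and use a key object, not a number).
interface ScanInput {
  Limit?: number;
  ExclusiveStartKey?: number;
}
interface ScanOutput<T> {
  Items: T[];
  LastEvaluatedKey?: number;
}
type ScanFn<T> = (input: ScanInput) => Promise<ScanOutput<T>>;

// Fetch at most `total` items, requesting no more than `perRequest` per call.
async function scanUpTo<T>(
  scan: ScanFn<T>,
  total: number,
  perRequest: number
): Promise<T[]> {
  const items: T[] = [];
  let startKey: number | undefined = undefined;
  while (items.length < total) {
    const remaining = total - items.length;
    const page: ScanOutput<T> = await scan({
      Limit: Math.min(perRequest, remaining),
      ExclusiveStartKey: startKey,
    });
    items.push(...page.Items);
    startKey = page.LastEvaluatedKey;
    if (startKey === undefined) {
      break; // no key returned: nothing left to read
    }
  }
  return items;
}

// Hypothetical in-memory scan used only to exercise the loop without AWS.
// The "key" is simply the index of the next unread row.
function makeFakeScan<T>(rows: T[]): ScanFn<T> {
  return async ({ Limit, ExclusiveStartKey }) => {
    const start = ExclusiveStartKey ?? 0;
    const end = Math.min(rows.length, start + (Limit ?? rows.length));
    return {
      Items: rows.slice(start, end),
      LastEvaluatedKey: end < rows.length ? end : undefined,
    };
  };
}
```

Injecting the scan function keeps the loop testable; with the real client, the injected function would wrap client.send(new ScanCommand(...)) and pass the key object through unchanged.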

I hope this clarifies things.

All the best,
Ran~

1 reply

@RanVaknin
Thank you for your detailed answer! From now on, we will use pageSize.
Also, it might be better to remove the Limit parameter from the arguments accepted by paginateScan and paginateQuery.


@RanVaknin I faced a similar issue too. In my case, I expect to get 25 results for every request I send to DynamoDB. If I set pageSize to 1, will it affect RCU consumption and increase costs? This was not a problem with the https://github.com/aws/aws-sdk-js-v3/blob/v3.431.0/clients/client-dynamodb/src/pagination/QueryPaginator.ts version, where I got the results I expected. Any recommendations?


This discussion was converted from issue #5952 on April 05, 2024 16:42.
