DynamoDB events have milliseconds, not seconds, in their timestamps which crashes deserialization #839
Comments
Hi @cdegroot, Good afternoon. Looking at the code, it appears to be an automated unit test whose test data is a real record from DynamoDB Streams. Could you please share the source of the JSON string? In your JSON string, the value for Thanks, |
I created that test based on real data I saw flowing so I could reproduce (and work around the issue). I won't share our full data in public, but hopefully the following exposition helps. From another (JS-based) Kinesis consumer, which is logging full Kinesis events, I copied an event:
I snipped most of the payload for brevity and privacy, but base64 decoding it:
(again edited for brevity and privacy). As you can see, there's a 13-digit timestamp in there. I added three zeroes to the existing test data (which I think I snipped from here) to reproduce the issue. |
Hi @cdegroot, Good afternoon. Unfortunately I'm unable to reproduce the issue. Used the below steps:
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <GenerateRuntimeConfigurationFiles>true</GenerateRuntimeConfigurationFiles>
    <AWSProjectType>Lambda</AWSProjectType>
    <!-- This property makes the build directory similar to a publish directory and helps the AWS .NET Lambda Mock Test Tool find project dependencies. -->
    <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Amazon.Lambda.Core" Version="2.0.0" />
    <PackageReference Include="Amazon.Lambda.DynamoDBEvents" Version="2.0.0" />
    <PackageReference Include="Amazon.Lambda.Serialization.SystemTextJson" Version="2.1.0" />
  </ItemGroup>
</Project>
Function.cs
using Amazon.Lambda.Core;
using Amazon.Lambda.DynamoDBEvents; // required for the DynamoDBEvent type used below
using System.IO;
// Assembly attribute to enable the Lambda function's JSON input to be converted into a .NET class.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]
namespace DynamoDbStream_Issue839
{
    public class Function
    {
        public void FunctionHandler(DynamoDBEvent dynamoDBEvent, ILambdaContext context)
        {
            context.Logger.LogLine($"DynamoDBEvent received with {dynamoDBEvent.Records.Count} records.");
            foreach (var record in dynamoDBEvent.Records)
            {
                context.Logger.LogLine($"{record.EventID}:{record.EventName}, Dynamodb.ApproximateCreationDateTime:{record.Dynamodb.ApproximateCreationDateTime}");
            }
        }
    }
}
aws-lambda-tools-defaults.json
{
  "Information" : [
    "This file provides default values for the deployment wizard inside Visual Studio and the AWS Lambda commands added to the .NET Core CLI.",
    "To learn more about the Lambda commands with the .NET Core CLI execute the following command at the command line in the project root directory.",
    "dotnet lambda help",
    "All the command line options for the Lambda command can be specified in this file."
  ],
  "profile" : "default",
  "region" : "us-east-2",
  "configuration" : "Release",
  "framework" : "netcoreapp3.1",
  "function-runtime" : "dotnetcore3.1",
  "function-memory-size" : 256,
  "function-timeout" : 30,
  "function-handler" : "DynamoDbStream_Issue839::DynamoDbStream_Issue839.Function::FunctionHandler",
  "function-name" : "StreamConsumer",
  "package-type" : "Zip",
  "function-role" : "arn:aws:iam::139480602983:role/lambda_exec_StreamConsumer-0",
  "tracing-mode" : "PassThrough",
  "environment-variables" : "",
  "image-tag" : "",
  "function-description" : ""
}
NOTE: To keep things simple for reproduction, the role
using Amazon.Lambda.Core;
using System.IO;
// Assembly attribute to enable the Lambda function's JSON input to be converted into a .NET class.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]
namespace DynamoDbStream_Issue839
{
    public class Function
    {
        public Stream FunctionHandler(Stream userInput, ILambdaContext context)
        {
            string jsonString;
            using (var reader = new StreamReader(userInput))
            {
                jsonString = reader.ReadToEnd();
            }
            context.Logger.LogLine($"JSON received: {jsonString}");
            byte[] bytes = System.Text.Encoding.UTF8.GetBytes(jsonString);
            MemoryStream stream = new MemoryStream(bytes);
            return stream;
        }
    }
}
As you can see from the JSON captured above, the Please let me know if this helps. Thanks, |
This issue has not received a response in 2 weeks. If you want to keep this issue open, please just leave a comment below and auto-close will be canceled. |
I am seeing this issue as well, and receiving the same "value out of range" exception related to the "ApproximateCreationDateTime" field during deserialization. This only occurs when trying to process events that are being streamed through Kinesis (DynamoDB --> Kinesis --> Lambda). I believe that is why you couldn't reproduce it. |
I can second that analysis, especially after studying the code and what is in the event. |
@Tragetaschen It would be better to open a new issue linking it to this one. Thanks. |
Issue is reproducible using the following steps:
using Amazon.DynamoDBv2.Model;
using Amazon.Lambda.Core;
using Amazon.Lambda.KinesisEvents;
using Amazon.Lambda.Serialization.SystemTextJson;
using System;
using System.IO;
using System.Text;
// Assembly attribute to enable the Lambda function's JSON input to be converted into a .NET class.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]
namespace DynamoDbKinesisLambdaNewTest
{
    public class Function
    {
        public void FunctionHandler(KinesisEvent kinesisEvent, ILambdaContext context)
        {
            context.Logger.LogLine($"Beginning to process {kinesisEvent.Records.Count} records...");
            foreach (var record in kinesisEvent.Records)
            {
                context.Logger.LogLine($"Event ID: {record.EventId}");
                context.Logger.LogLine($"Event Name: {record.EventName}");
                context.Logger.LogLine($"ApproximateArrivalTimestamp: {record.Kinesis.ApproximateArrivalTimestamp}");
                string recordData = GetRecordContents(record.Kinesis);
                context.Logger.LogLine($"Record Data:");
                context.Logger.LogLine(recordData);
                context.Logger.LogLine($"Record Data (base64):");
                context.Logger.LogLine(Convert.ToBase64String(record.Kinesis.Data.ToArray()));
                context.Logger.LogLine($"Deserializing data to 'Amazon.DynamoDBv2.Model.Record'");
                record.Kinesis.Data.Position = 0;
                var serializer = new DefaultLambdaJsonSerializer();
                Record dynamoDBRecord = serializer.Deserialize<Amazon.DynamoDBv2.Model.Record>(record.Kinesis.Data);
                context.Logger.LogLine($"Successfully deserialized data to 'Amazon.DynamoDBv2.Model.Record'");
            }
            context.Logger.LogLine("Stream processing complete.");
        }
        private string GetRecordContents(KinesisEvent.Record streamRecord)
        {
            using (var reader = new StreamReader(streamRecord.Data, Encoding.UTF8, true, -1, true))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
{
  "date": {
    "S": "2020-03-07T16:29:50+02:00"
  },
  "reactor": {
    "S": "reactor-1"
  },
  "recordId": {
    "S": "2"
  },
  "temp": {
    "N": "800"
  }
}
RESULT:
It looks like the DynamoDB record sent to the Kinesis data stream has For testing purposes, the base64-encoded string in the CloudWatch logs can be used with the Kinesis example request in the Mock Lambda Test Tool in Visual Studio. The issue is probably related to the data sent from DynamoDB to the Kinesis data stream. Related articles:
Thanks, |
Tracking the above issue internally. Will post here as updates become available. |
Shouldn't this be reopened and kept open then? |
Agreed, reopening. It does appear that the service response may vary depending on context.
I think we may be able to either try to recognize that we're within a Kinesis context, or else DateTimeConverter could be expanded to handle both seconds and milliseconds as the original issue suggested. |
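As a rough illustration of the second option, here is a minimal sketch of a System.Text.Json converter that accepts both units. The class name `FlexibleEpochDateTimeConverter` is hypothetical, and this is not the converter that ships with Amazon.Lambda.Serialization.SystemTextJson; the cut-off value is the one suggested in the original issue description, and the sketch assumes the timestamp arrives as an integer JSON number.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical converter (an assumption for illustration, not the library's own DateTimeConverter).
public class FlexibleEpochDateTimeConverter : JsonConverter<DateTime>
{
    // Cut-off taken from the issue description: larger values are treated as milliseconds.
    private const long MillisecondCutOff = 100_000_000_000L;

    public override DateTime Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Only integer epoch values are handled in this sketch.
        if (reader.TokenType == JsonTokenType.Number && reader.TryGetInt64(out long epoch))
        {
            return epoch > MillisecondCutOff
                ? DateTimeOffset.FromUnixTimeMilliseconds(epoch).UtcDateTime
                : DateTimeOffset.FromUnixTimeSeconds(epoch).UtcDateTime;
        }
        throw new JsonException("Expected an integer epoch timestamp.");
    }

    public override void Write(Utf8JsonWriter writer, DateTime value, JsonSerializerOptions options)
    {
        // Always emit seconds, so the output is unambiguous.
        writer.WriteNumberValue(new DateTimeOffset(value.ToUniversalTime()).ToUnixTimeSeconds());
    }
}
```

Such a converter could be registered through `JsonSerializerOptions.Converters`; how it would be wired into the Lambda serializer is left open here.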
Comments on closed issues are hard for our team to see. |
Description
I'm creating a lambda to process DynamoDB events streamed through Kinesis, which means they're wrapped in base64, etcetera. During processing I pull them out and then deserialize them to DynamoDBv2 records:
Reproduction Steps
This gist: https://gist.github.com/cdegroot/6e8e1957d9d08a595b3918dce3e16d84. Note that all I did was to add a couple of zeros to the test event.
Logs
Stacktrace:
Environment
dotnet test
etc.
Resolution
I can look later, will just kill the last three digits with a regex for now, but the usual fix is to take a cut-off value (100000000000 - somewhere in the 6th millennium) and if it's above, interpret as millisecs, if not, as secs.
Of course, the proper fix is to make DynamoDB emit always one or the other :)
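For reference, the interim regex workaround mentioned above could look roughly like the sketch below. The class and method names are hypothetical, and it assumes the timestamp appears in the decoded payload as a bare 13-digit number under the "ApproximateCreationDateTime" key; values that already have 10 digits are left untouched.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the stop-gap described in the issue.
public static class TimestampWorkaround
{
    // Captures the key plus the first 10 digits of a 13-digit value; the last 3 digits are dropped.
    private static readonly Regex MillisTimestamp = new Regex(
        "(\"ApproximateCreationDateTime\"\\s*:\\s*)(\\d{10})\\d{3}(?!\\d)",
        RegexOptions.Compiled);

    public static string TruncateToSeconds(string json)
    {
        // Rewrites the JSON text before it is handed to the deserializer.
        return MillisTimestamp.Replace(json, "${1}${2}");
    }
}
```

This throws away the millisecond precision, so it is only a stop-gap until the deserializer handles both units.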
This is a 🐛 bug-report