Use aiobotocore 2.2.0 to support assume role credentials #157

Open
wants to merge 7 commits into
base: master
from

Conversation

gcavalcante8808

Scenario

This PR bumps aiobotocore to its latest version (2.2.0), which allows thumbor-aws to authenticate using IAM role / web identity credentials.
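As a rough illustration of what "web identity credentials" means in practice: to my understanding, botocore's web identity provider picks up the role and token file from the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables. The helper below is a hypothetical pre-flight check and is not part of thumbor-aws or this PR; it only reports which of those variables are unset.

```python
import os

# Environment variables read by botocore's web identity credential provider.
WEB_IDENTITY_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_web_identity_vars(env=None):
    """Return the web identity variables that are unset or empty.

    `env` defaults to os.environ; passing a dict makes the check testable.
    This helper is hypothetical and only illustrates the configuration.
    """
    env = os.environ if env is None else env
    return [name for name in WEB_IDENTITY_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_web_identity_vars()
    if missing:
        print("Web identity auth is not configured; missing:", ", ".join(missing))
    else:
        print("Web identity variables are set.")
```

An empty result only means the variables are present; it does not verify that the role can actually be assumed.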

What has been Done

  • chore: Updated tests and implementation to use aiobotocore 2.2.0.

@gcavalcante8808
Author

Relates to: #156

Bucket=self._bucket,
Key=self._clean_key(path),
)
async with self.session.create_client('s3', region_name=self.region_name,

Did you test this with a load-intensive process? We had performance issues in the past without the singleton client
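For context, the trade-off being asked about can be sketched as follows: instead of entering `create_client(...)` on every request, one client is opened lazily and reused. This is a minimal sketch under stated assumptions, not the code in this PR: `FakeFactory` is a stand-in for an async client factory (it is not the aiobotocore API), and `Bucket` here is a hypothetical wrapper.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


class FakeFactory:
    """Stand-in for an async client factory; counts how many clients it builds."""

    def __init__(self):
        self.created = 0

    @asynccontextmanager
    async def create_client(self, service_name):
        self.created += 1
        yield {"service": service_name}


class Bucket:
    """Hypothetical wrapper that opens one client lazily and reuses it."""

    def __init__(self, factory):
        self._factory = factory
        self._stack = AsyncExitStack()  # keeps the client context open
        self._client = None
        self._lock = asyncio.Lock()     # avoids building two clients concurrently

    async def client(self):
        async with self._lock:
            if self._client is None:
                self._client = await self._stack.enter_async_context(
                    self._factory.create_client("s3")
                )
        return self._client

    async def close(self):
        # Exits the client context that was entered in client().
        await self._stack.aclose()


async def main():
    factory = FakeFactory()
    bucket = Bucket(factory)
    # Simulate 100 concurrent requests that all need a client.
    await asyncio.gather(*(bucket.client() for _ in range(100)))
    await bucket.close()
    return factory.created


if __name__ == "__main__":
    print("clients built:", asyncio.run(main()))
```

The cost of the shared client is that something has to call `close()` at shutdown, which is the part a per-request `async with` handles automatically.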

Author

@gcavalcante8808 gcavalcante8808 Apr 19, 2022

I ran some tests with the hey client at the time and didn't hit any bad behavior (but I didn't keep the results :( ).

I remember that the client was a coroutine itself ... but I'll run some tests and paste the results here to help with the analysis.

Originally, I was worried about opening the payload at https://github.com/thumbor-community/aws/pull/157/files#diff-8c5f6e09db7784ddba2fc0a87e8c9e5436275868ae07088bc1f5a1c888c45224R74-R81, but that hint about the client is welcome as well.

I'll be back soon with the hey results.

Author

Are there any performance testing guidelines or baseline numbers to compare against?

Here is the information about the tests that I made.

In my case, my CPU is the following:

processor	: 23
vendor_id	: AuthenticAMD
cpu family	: 23
model		: 113
model name	: AMD Ryzen 9 3900X 12-Core Processor
stepping	: 0
microcode	: 0x8701021
cpu MHz		: 2456.247
cache size	: 512 KB
physical id	: 0
siblings	: 24
core id		: 14
cpu cores	: 12
apicid		: 29
initial apicid	: 29
fpu		: yes
fpu_exception	: yes
cpuid level	: 16
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sme sev sev_es
bugs		: sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 7585.13
TLB size	: 3072 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

I ran all tests using Docker on a Linux machine, and each test was run three times. In my scenario, I have an S3 bucket served through minio and a redis cache:

version: '3'

volumes:
  redis-data:
  s3-data:
  thumbor-data:
  thumbor-logs:

services:
  s3:
    image: gcavalcante8808/minio-dev:latest
    environment:
      MINIO_ROOT_USER: minio
      MINIO_ROOT_PASSWORD: minio123
      MINIO_INITIAL_BUCKET: default
      MINIO_INITIAL_BUCKET_PERMISSION: none
    volumes:
      - s3-data:/data

  thumbor:
    image: apsl/thumbor:latest
    volumes:
      - thumbor-data:/data
      - thumbor-logs:/logs
    environment:
      - DETECTORS=['thumbor.detectors.queued_detector.queued_complete_detector']
      - STORAGE=thumbor.storages.mixed_storage
      - REDIS_STORAGE_SERVER_HOST=redis
      - REDIS_STORAGE_SERVER_PORT=6379
      - REDIS_STORAGE_SERVER_DB=0
      - REDIS_QUEUE_SERVER_HOST=redis
      - REDIS_QUEUE_SERVER_PORT=6379
      - REDIS_QUEUE_SERVER_DB=0
      - MIXED_STORAGE_DETECTOR_STORAGE=tc_redis.storages.redis_storage
      - S3_USE_SIGV4=false
      - LOADER=tc_aws.loaders.s3_loader
      - TC_AWS_REGION=us-east-1
      - TC_AWS_LOADER_BUCKET=default
      - TC_AWS_ENDPOINT="http://s3:9000"
      - AWS_ACCESS_KEY_ID=minio
      - AWS_SECRET_ACCESS_KEY=minio123
    ports:
      - 8080:8000

  new:
    image: thumbor:dev
    build: thumbor/
    volumes:
      - thumbor-data:/data
      - thumbor-logs:/logs
      - ./thumbor/thumbor.conf:/usr/src/thumbor.conf
    command:
      - thumbor
      - -c
      - /usr/src/thumbor.conf
    environment:
      - DETECTORS=['thumbor.detectors.queued_detector.queued_complete_detector']
      - STORAGE=thumbor.storages.mixed_storage
      - REDIS_STORAGE_SERVER_HOST=redis
      - REDIS_STORAGE_SERVER_PORT=6379
      - REDIS_STORAGE_SERVER_DB=0
      - REDIS_QUEUE_SERVER_HOST=redis
      - REDIS_QUEUE_SERVER_PORT=6379
      - REDIS_QUEUE_SERVER_DB=0
      - MIXED_STORAGE_DETECTOR_STORAGE=tc_redis.storages.redis_storage
      - S3_USE_SIGV4=false
      - LOADER=tc_aws.loaders.s3_loader
      - TC_AWS_REGION=us-east-1
      - TC_AWS_LOADER_BUCKET=default
      - TC_AWS_STORAGE_BUCKET=default
      - TC_AWS_ENDPOINT="http://s3:9000"
      - AWS_ACCESS_KEY_ID=minio
      - AWS_SECRET_ACCESS_KEY=minio123
    ports:
      - 9999:8888

  redis:
    image: redis:latest
    volumes:
      - redis-data:/data

Below, I post the results for both thumbor 6.3 and thumbor 7.0.7 with the new plugin.

Author

@gcavalcante8808 gcavalcante8808 Apr 19, 2022

Thumbor 6.3.0 using apsl/thumbor docker image

This is an old but functional image (though it has lots of critical CVEs) that uses the following packages:

appdirs==1.4.3
backports-abc==0.5
boto==2.42.0
botocore==1.2.12
certifi==2017.4.17
colour==0.1.3
contextlib2==0.5.4
dateutils==0.6.6
derpconf==0.8.1
docutils==0.13.1
envtpl==0.4.1
futures==3.1.1
graphicsmagick-engine==0.1.1
itty==0.8.2
Jinja2==2.9.6
jmespath==0.9.2
libthumbor==1.3.2
MarkupSafe==1.0
numpy==1.11.0
opencv-engine==1.0.1
packaging==16.8
pexif==0.15
pgmagick==0.6.1
Pillow==3.4.2
pycrypto==2.6.1
pycurl==7.43.0
pylibmc==1.5.2
pymongo==3.4.0
pyparsing==2.2.0
pyremotecv==0.5.0
pyres==1.2
pystache==0.5.4
python-dateutil==2.6.0
pytz==2017.2
raven==5.15.0
redis==2.10.5
remotecv==2.2.1
requests==2.13.0
setproctitle==1.1.10
shortuuid==0.5.0
simplejson==3.10.0
singledispatch==3.4.0.3
six==1.10.0
statsd==3.2.1
tc-aws==6.0.2
tc-core==0.4.0
tc-mongodb==5.1.0
tc-redis==1.0.1
tc-shortener==0.2.2
thumbor==6.3.0
thumbor-memcached==5.1.0
tornado==4.5
tornado-botocore==1.1.0
virtualenv==15.1.0

The command hey -c 100 -z 30s http://localhost:8080/unsafe/300x200/smart/0864bf97-8369-42d7-ad8c-449541ea541c-original.png, which emulates 100 concurrent clients for 30 seconds, yielded the following results:

Summary:
  Total:	33.6502 secs
  Slowest:	4.5284 secs
  Fastest:	0.2421 secs
  Average:	3.8583 secs
  Requests/sec:	24.5764

  Total data:	28952443 bytes
  Size/request:	35009 bytes

Response time histogram:
  0.242 [1]	|
  0.671 [3]	|
  1.099 [12]	|■
  1.528 [8]	|■
  1.957 [12]	|■
  2.385 [11]	|■
  2.814 [11]	|■
  3.242 [9]	|■
  3.671 [20]	|■■
  4.100 [512]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  4.528 [228]	|■■■■■■■■■■■■■■■■■■


Latency distribution:
  10% in 3.6563 secs
  25% in 3.9749 secs
  50% in 4.0372 secs
  75% in 4.1039 secs
  90% in 4.1648 secs
  95% in 4.1994 secs
  99% in 4.3036 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0017 secs, 0.2421 secs, 4.5284 secs
  DNS-lookup:	0.0008 secs, 0.0000 secs, 0.0317 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0011 secs
  resp wait:	3.8565 secs, 0.2397 secs, 4.5265 secs
  resp read:	0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200]	827 responses

During the tests, CPU usage was 100% (1 CPU) and RAM usage was about 160MB in the first run, but it increased by ~20MB on each test round, which may indicate some sort of memory leak.

Author

Thumbor 7.0.7

This image has the following packages:

aiobotocore==0.12.0
aiohttp==3.8.1
aioitertools==0.10.0
aiosignal==1.2.0
async-timeout==4.0.2
attrs==21.4.0
botocore==1.15.15
cairocffi==1.3.0
CairoSVG==2.5.2
certifi==2021.10.8
cffi==1.15.0
cfgv==3.3.1
charset-normalizer==2.0.12
colorful==0.5.4
cssselect2==0.6.0
defusedxml==0.7.1
Deprecated==1.2.13
derpconf==0.8.3
distlib==0.3.4
docutils==0.15.2
filelock==3.6.0
frozenlist==1.3.0
identify==2.4.12
idna==3.3
jmespath==0.10.0
libthumbor==2.0.2
multidict==6.0.2
nodeenv==1.6.0
numpy==1.22.3
opencv-python-headless==4.5.5.64
packaging==21.3
Pillow==9.1.0
platformdirs==2.5.2
pre-commit==2.18.1
py3exiv2==0.7.1
pycparser==2.21
pycurl==7.45.1
pyparsing==3.0.8
pyres==1.5
python-dateutil==2.8.2
pytz==2022.1
PyYAML==6.0
redis==4.2.2
remotecv @ git+https://github.com/thumbor/remotecv@58f46eaa8ffe4e83c5afe2ea04397da8d8834a7b
sentry-sdk==0.14.4
setproctitle==1.2.3
simplejson==3.17.6
six==1.16.0
socketfromfd==0.2.0
statsd==3.3.0
tc-aws==7.0b0
tc-redis @ git+https://github.com/thumbor-community/redis@e4dea465e1f388173083143dbc0942caa143ef48
thumbor==7.0.7
tinycss2==1.1.1
toml==0.10.2
tornado==6.1
typing-extensions==4.2.0
urllib3==1.25.11
virtualenv==20.14.1
webcolors==1.11.1
webencodings==0.5.1
wrapt==1.14.0
yarl==1.7.2

The command hey -c 100 -z 30s http://localhost:9999/unsafe/300x200/smart/0864bf97-8369-42d7-ad8c-449541ea541c-original.png, which emulates 100 concurrent clients for 30 seconds, yielded the following results:

Summary:
  Total:	32.8143 secs
  Slowest:	3.1669 secs
  Fastest:	0.0409 secs
  Average:	2.6795 secs
  Requests/sec:	35.7161

  Total data:	41030548 bytes
  Size/request:	35009 bytes

Response time histogram:
  0.041 [1]	|
  0.353 [10]	|
  0.666 [11]	|
  0.979 [10]	|
  1.291 [11]	|
  1.604 [11]	|
  1.917 [11]	|
  2.229 [12]	|
  2.542 [11]	|
  2.854 [978]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  3.167 [106]	|■■■■


Latency distribution:
  10% in 2.7035 secs
  25% in 2.7624 secs
  50% in 2.7868 secs
  75% in 2.8109 secs
  90% in 2.8479 secs
  95% in 2.8790 secs
  99% in 2.9021 secs

Details (average, fastest, slowest):
  DNS+dialup:	0.0011 secs, 0.0409 secs, 3.1669 secs
  DNS-lookup:	0.0004 secs, 0.0000 secs, 0.0288 secs
  req write:	0.0000 secs, 0.0000 secs, 0.0009 secs
  resp wait:	2.6784 secs, 0.0388 secs, 3.1652 secs
  resp read:	0.0001 secs, 0.0000 secs, 0.0002 secs

Status code distribution:
  [200]	1172 responses

This time, RAM usage was about 82MB and didn't change across the other test rounds.
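To put the two hey summaries side by side, here is a small sketch that only does arithmetic on the reported "Requests/sec" and "Average" values. The function name is hypothetical; the inputs are copied from the two runs above. The last two fields multiply throughput by average latency, which estimates how many requests were in flight at once.

```python
def compare_runs(old_rps, new_rps, old_avg_s, new_avg_s):
    """Compare two load-test summaries (requests/sec and average latency)."""
    return {
        # Relative change in throughput, in percent.
        "throughput_gain_pct": (new_rps / old_rps - 1) * 100,
        # Relative reduction in average latency, in percent.
        "latency_drop_pct": (1 - new_avg_s / old_avg_s) * 100,
        # Throughput x average latency approximates requests in flight.
        "old_in_flight": old_rps * old_avg_s,
        "new_in_flight": new_rps * new_avg_s,
    }


# Thumbor 6.3.0 run vs. Thumbor 7.0.7 run, values taken from the summaries above.
result = compare_runs(24.5764, 35.7161, 3.8583, 2.6795)

if __name__ == "__main__":
    for name, value in result.items():
        print(f"{name}: {value:.1f}")
```

With these numbers the throughput gain is roughly 45% and the average latency drop roughly 30%. Both in-flight estimates land near the 100 concurrent clients hey was configured with, which may suggest that the multi-second averages mostly reflect requests queuing behind a saturated process rather than the cost of a single request.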

Thanks for the performance tests.
Our earlier load testing was done on "live" servers with a lot of simulated users, so the results are not really comparable, but a wait time of ~3-4 seconds here looks quite slow.
Ideally we should do some load testing on a live server with AWS S3, and compare before (Thumbor 6) / current (Thumbor 7 with latest tc_aws without your PR) / after (Thumbor 7 + your PR) to check the improvements.

@Bladrak

Bladrak commented Dec 7, 2022

@gcavalcante8808 hi, did you try this on a live server? Did it handle the load properly?

@gcavalcante8808
Author

@Bladrak Well, since I ended up leaving the company I was working for, I can only speak for the period I was there: it ran in production for 6 months without complications =D

@Bladrak

Bladrak commented Dec 7, 2022

Ok great :) Would you mind switching the target branch to master and rebasing this? I think we will be able to merge it now!

@gcavalcante8808 gcavalcante8808 changed the base branch from py3 to master December 7, 2022 16:41
setup.py Outdated
'thumbor>=7.0.0a2,<8',
'aiobotocore==2.2.0',
'boto3>=1.9,<1.13',
You'll need to update those versions to match what aiobotocore needs. I've restricted the range to reduce the build time.
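One way to find out which ranges the installed aiobotocore declares is to read its package metadata. The sketch below uses only the standard library's `importlib.metadata`; the helper name is hypothetical, and it returns an empty list when the distribution is not installed rather than guessing at version numbers.

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_requirements(dist_name, needle):
    """Return the requirement strings of `dist_name` that mention `needle`.

    Returns an empty list when the distribution is not installed or
    declares no requirements.
    """
    try:
        reqs = requires(dist_name) or []
    except PackageNotFoundError:
        return []
    return [req for req in reqs if needle in req.lower()]


if __name__ == "__main__":
    # Prints whatever botocore/boto3 constraints the local aiobotocore declares.
    for line in declared_requirements("aiobotocore", "boto"):
        print(line)
```

The printed constraints can then be mirrored in setup.py so that pip does not have to backtrack across many boto3 releases.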

@Bladrak

Bladrak commented Dec 8, 2022

@gcavalcante8808 there seems to be an issue with Circle CI from your fork. It may be related to https://circleci.com/docs/oss/#build-pull-requests-from-forked-repositories

If a user submits a pull request to your repository from a fork, but no pipeline is triggered, then the user most likely is following a project fork on their personal account rather than the project itself of CircleCI, causing the jobs to trigger under the user’s personal account and not the organization account. To resolve this issue, have the user unfollow their fork of the project on CircleCI and instead follow the source project. This will trigger their jobs to run under the organization when they submit pull requests.

Can you check that this is right for your fork? And update accordingly if need be?

@Bladrak

Bladrak commented Dec 12, 2022

Hi @gcavalcante8808 could you check out the CircleCI issue?

@oliverschewe

Any updates on this? I would also like to use assumeRole via an AWS Web Identity Token, and it looks like the old boto version does not support it.

I am currently getting the following errors.

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/botocore/credentials.py", line 502, in _protected_refresh
    metadata = self._refresh_using()
  File "/usr/local/lib/python3.9/site-packages/botocore/credentials.py", line 643, in fetch_credentials
    return self._get_cached_credentials()
  File "/usr/local/lib/python3.9/site-packages/botocore/credentials.py", line 654, in _get_cached_credentials
    self._write_to_cache(response)
  File "/usr/local/lib/python3.9/site-packages/botocore/credentials.py", line 679, in _write_to_cache
    self._cache[self._cache_key] = deepcopy(response)
  File "/usr/local/lib/python3.9/copy.py", line 161, in deepcopy
    rv = reductor(4)
TypeError: cannot pickle 'coroutine' object
2024-01-24 17:09:42 thumbor:ERROR [BaseHander.execute_image_operations] cannot pickle 'coroutine' object
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/thumbor/handlers/__init__.py", line 146, in execute_image_operations
    result = await self.context.modules.result_storage.get()
  File "/usr/local/lib/python3.9/site-packages/tc_aws/result_storages/s3_storage.py", line 55, in get
    key = await super(Storage, self).get(path)
  File "/usr/local/lib/python3.9/site-packages/tc_aws/aws/storage.py", line 53, in get
    return await self.storage.get(file_abspath)
  File "/usr/local/lib/python3.9/site-packages/tc_aws/aws/bucket.py", line 75, in get
    return await self._client.get_object(
  File "/usr/local/lib/python3.9/site-packages/aiobotocore/client.py", line 76, in _make_api_call
    http, parsed_response = await self._make_request(
  File "/usr/local/lib/python3.9/site-packages/aiobotocore/client.py", line 96, in _make_request
    return await self._endpoint.make_request(operation_model, request_dict)
  File "/usr/local/lib/python3.9/site-packages/aiobotocore/endpoint.py", line 68, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/usr/local/lib/python3.9/site-packages/botocore/endpoint.py", line 115, in create_request
    self._event_emitter.emit(event_name, request=request,
  File "/usr/local/lib/python3.9/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
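A plausible reading of this traceback, not confirmed in the thread, is that synchronous credential-caching code received a coroutine that was never awaited and then tried to `deepcopy` it. The sketch below reproduces only that failure mode with the standard library; `fetch_credentials` and `cache_response` are hypothetical stand-ins, not botocore or aiobotocore functions.

```python
import asyncio
import copy


async def fetch_credentials():
    """Stand-in for an async credential fetcher (hypothetical)."""
    return {"access_key": "example", "secret_key": "example"}


def cache_response(cache, key, response):
    """Synchronous cache write that deep-copies the response first."""
    cache[key] = copy.deepcopy(response)


def cache_unawaited():
    """Pass the coroutine object itself to the cache; return the error text."""
    cache = {}
    pending = fetch_credentials()  # coroutine object, never awaited
    try:
        cache_response(cache, "creds", pending)
    except TypeError as exc:
        return str(exc)
    finally:
        pending.close()  # avoid a "never awaited" warning
    return None


def cache_awaited():
    """Await the fetcher first; the resulting dict copies without error."""
    cache = {}
    cache_response(cache, "creds", asyncio.run(fetch_credentials()))
    return cache["creds"]


if __name__ == "__main__":
    print("unawaited:", cache_unawaited())
    print("awaited:", cache_awaited())
```

If that reading is right, the fix belongs in the versions being paired (an async-aware credential provider), not in the caching call itself.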

@Bladrak

Bladrak commented Feb 5, 2024

Hi @oliverschewe it seems this PR is a bit outdated. If you'd like to retake the PR and submit an updated one, I'll be happy to review.

3 participants