CPU heavy consumption #84
Comments
Hi. I just noticed this, and you are totally right. The moment we run client.start_stream_transcription, one CPU core becomes completely occupied, and that will be a problem. @david-oliveira-br, have you found any solution to this?
Hello! I noticed this same issue as well, where my CPU would spike to 100%. Interestingly enough, this only happens when building my code using When I use

Stepping through the transcribe SDK code and monitoring my CPU usage, I found the exact spot where my CPU spikes. The method that triggers the spike is here:

response = await self._session_manager.make_request(
    signed_request.uri,
    method=signed_request.method,
    headers=signed_request.headers.as_list(),
    body=signed_request.body,
)

and more specifically, when stepping through that request call above, the spike happens when the stream is activated here:

def _set_stream(self, stream: http.HttpClientStream):
    if self._stream is not None:
        raise HTTPException("Stream already set on AwsCrtHttpResponse object")
    self._stream = stream
    self._stream.completion_future.add_done_callback(self._on_complete)
    self._stream.activate()  # <- this call triggers the spike

def activate(self):
    """Begin sending the request.

    The HTTP stream does nothing until this is called. Call activate() when you
    are ready for its callbacks and events to fire.
    """
    _awscrt.http_client_stream_activate(self)  # <-- 100% CPU SPIKE HAPPENS HERE

C code for the awscrt

Also, this spike occurs before I even begin transcribing any audio! It happens the moment this stream is activated. Do you guys have any idea what's causing this? Thanks.
Steps to recreate this issue

If you're on

Note: Change the region in the transcribe client to the region you want to test with.

import asyncio

from amazon_transcribe.client import TranscribeStreamingClient


async def start_stream():
    transcribe_client = TranscribeStreamingClient(region="us-east-1")
    transcribe_stream = await transcribe_client.start_stream_transcription(
        language_code="en-US",
        media_sample_rate_hz=16000,
        media_encoding="pcm",
        language_model_name=None,
        vocabulary_name=None,
        vocab_filter_method=None,
        vocab_filter_name=None,
        show_speaker_label=None,
        enable_channel_identification=None,
        number_of_channels=None,
        enable_partial_results_stabilization=None,
        partial_results_stability=None,
        session_id=None,
    )
    # Put a breakpoint here and look at your CPU usage.
    print("put breakpoint here")
    # Loop so we can monitor CPU usage. Sleep instead of spinning on `pass`,
    # so the loop itself does not keep a core busy and skew the measurement.
    while True:
        await asyncio.sleep(1)


def main():
    asyncio.run(start_stream())


if __name__ == "__main__":
    main()

Build the python code using docker

If you're not using Ubuntu, you can build the code using Docker. Put that python code above in a

Note: Fill in your AWS creds in the Dockerfile so it can authenticate with transcribe.

FROM ubuntu:20.04
ENV AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID>
ENV AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY>
ENV AWS_SESSION_TOKEN=<AWS_SESSION_TOKEN>
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y software-properties-common && \
    add-apt-repository -y ppa:deadsnakes/ppa
RUN apt-get install --no-install-recommends -y \
    python3.9=3.9.16-1+focal1 \
    python3-pip \
    && apt-get clean && rm -rf /var/lib/apt/lists/*
RUN python3.9 -m pip install amazon-transcribe==0.6.1
WORKDIR /transcribe_high_cpu_test
COPY main.py ./
CMD ["python3.9", "main.py"]

Now build the image and run the image while monitoring your CPU usage.

docker build -t transcribe-cpu-usage-test .
docker run transcribe-cpu-usage-test:latest
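For anyone who would rather get a number than watch a system monitor, here is a minimal sketch that could be awaited in place of the monitoring loop in the reproduction script. It compares process CPU time with wall-clock time using only the standard library. The name `cpu_utilization` is hypothetical and not part of amazon-transcribe or awscrt. It relies on `time.process_time()` reporting CPU time for the whole process, which is expected to include time spent in native threads.

```python
import asyncio
import time


async def cpu_utilization(interval_s: float = 1.0) -> float:
    """Return CPU seconds used by this process per second of wall time.

    A value near 1.0 means roughly one core was fully busy during the
    interval, even though this coroutine itself was only sleeping.
    """
    cpu_start = time.process_time()
    wall_start = time.perf_counter()
    # Yield to the event loop; other tasks and threads keep running.
    await asyncio.sleep(interval_s)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used


if __name__ == "__main__":
    # Hypothetical usage: in the reproduction script, await this after
    # start_stream_transcription() returns.
    print(f"CPU utilization: {asyncio.run(cpu_utilization(0.5)):.2f} cores")
```

In an otherwise idle process this should print a value close to zero, so a value near 1.0 right after the stream is started would point to a spinning thread rather than to the script's own loop.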
Hi - curious if there will be any progress on resolving this issue? We're encountering this with our current version of Python (3.7.10) and CentOS (7:0.3).
Experiencing a similar issue to this -- activating more than 2 asynchronous streams and they seem to freeze on
I'm on macOS and Python 3.9. Has anyone found a solution?
UPDATE: Seems to work if I upgrade from Python 3.9 to 3.12 -- perhaps because this allows
Python 3.9 Python 3.12
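Since the behavior seems to differ between environments, it may help to post exact versions alongside each report. The sketch below collects them with the standard library. It assumes Python 3.8 or newer (for `importlib.metadata`), and the helper names `installed_version` and `environment_report` are hypothetical.

```python
import platform
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report() -> dict:
    """Collect the versions most relevant to this issue."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "amazon-transcribe": installed_version("amazon-transcribe"),
        "awscrt": installed_version("awscrt"),
    }


if __name__ == "__main__":
    for name, value in environment_report().items():
        print(f"{name}: {value}")
```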
Same issue on Windows 10. When you run the Quick Start sample program (with a longer .wav file), one core goes to 100%. Using Process Explorer you can see that the offending thread has a Start Address _awscrt.pyd!!PyInit__awscrt+0xce650. I am using awscrt 0.16, Python 3.12.7. I tried a pip upgrade but I get "amazon-transcribe 0.6.2 requires awscrt~=0.16.0". It seems to have upgraded anyway, and the problem remains. There must be some busy looping going on somewhere: the code should mostly be waiting for AWS to respond, not busy looping. Anyone have a solution?
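To illustrate the distinction drawn above, here is a small sketch comparing the CPU cost of a busy-wait loop with that of a blocking wait. It shows the general pattern only and makes no claim about what awscrt does internally. All function names are hypothetical.

```python
import threading
import time


def cpu_seconds_while(wait_fn, duration_s: float) -> float:
    """Run wait_fn(duration_s) and return the process CPU seconds it used."""
    start = time.process_time()
    wait_fn(duration_s)
    return time.process_time() - start


def busy_wait(duration_s: float) -> None:
    # Polls the clock in a tight loop: keeps one core busy for the whole wait.
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        pass


def blocking_wait(duration_s: float) -> None:
    # Blocks until the event is set or the timeout expires, using almost no CPU.
    threading.Event().wait(timeout=duration_s)


if __name__ == "__main__":
    print(f"busy wait:     {cpu_seconds_while(busy_wait, 0.5):.3f} CPU s")
    print(f"blocking wait: {cpu_seconds_while(blocking_wait, 0.5):.3f} CPU s")
```

Both calls take about the same wall-clock time, but only the busy wait should accumulate CPU time close to its duration, which matches the one-core-at-100% symptom reported in this thread.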
Hello guys, during some simple local tests I noticed my CPU usage hitting 100% while running the examples (mic or audio file). I spent some hours reviewing the API in an attempt to find a possible bottleneck but haven't found anything relevant yet. Do you have any recommendations or thoughts about it? Thanks in advance.