
Sustainable Python Performance

Uncovering the root causes


By Alex Ptakhin

Tech Lead at Prestatech GmbH, Berlin

Latest slides

Agenda

Who at least once used timeit, time.perf_counter(), CPU or memory usage profilers?

Image by Foundry Co from Pixabay

htop

Temporary solution

Scale-up: more CPU, more RAM

Now we have time to debug

CPU

time.perf_counter

import time

from calls import cpu_intensive_call

start = time.perf_counter()
cpu_intensive_call(num_iterations=5000000)
end = time.perf_counter()
print('Elapsed seconds: {:.1f}'.format(end - start))
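Wrapping the measurement in a small decorator keeps the timing code out of the call sites; a minimal sketch (timed and slow_sum are hypothetical names, not from the talk's repo):

```python
import functools
import time

def timed(func):
    """Print the wall-clock duration of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f'{func.__name__}: elapsed seconds: {elapsed:.3f}')
    return wrapper

@timed
def slow_sum(n):
    # Stand-in for the CPU-intensive workload.
    return sum(i * i for i in range(n))

slow_sum(5_000_000)
```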


cProfile

import cProfile

from calls import cpu_intensive_call

cProfile.run('cpu_intensive_call(num_iterations=5000000)')

cProfile

$ python -m cProfile \
    -o out/cpu-intensive-program.prof \
    load/cpu-intensive-program.py
$ snakeviz out/cpu-intensive-program.prof
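The same .prof data that snakeviz renders can also be inspected as plain text with the standard-library pstats module; a self-contained sketch (busy is a hypothetical stand-in for the real workload):

```python
import cProfile
import io
import pstats

def busy():
    # Stand-in for the CPU-intensive workload.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
busy()
profiler.disable()

# Print the 5 entries with the highest cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats('cumulative').print_stats(5)
print(report.getvalue())
```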


CPU Profilers

py-spy

$ py-spy record -o out/py-spy.svg -- python load/cpu-intensive-program.py
Sampling profilers collect stack traces from the running process at intervals, with no code changes needed
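Because it samples from outside the process, py-spy can also attach to a service that is already running; for example (12345 is a placeholder PID):

```shell
# Attach to an already-running process.
py-spy top --pid 12345    # live, top-like view of where time is spent
py-spy dump --pid 12345   # one-off dump of every thread's current stack
```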


Bonus: yappi for asyncio

$ python load/asyncio_yappi.py
Another interesting profiler; it supports asynchronous execution

Bonus: yappi for asyncio

$ python load/asyncio_yappi.py > out/asyncio_yappi.txt
$ snakeviz out/asyncio_yappi.prof


Problem found

After a few days

After a few days: failing processes and 500 errors.

htop

RAM

Temporary solution

Restart every N requests
Might also be good as a permanent solution :)
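If the service runs under a pre-fork server such as gunicorn, worker recycling is built in; a sketch (myapp:app is a placeholder):

```shell
# Recycle each worker after roughly 1000 requests; the jitter keeps
# all workers from restarting at the same moment.
gunicorn myapp:app --max-requests 1000 --max-requests-jitter 50
```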

sys.getsizeof

import sys

print(f'Empty dict size: {sys.getsizeof({})}')
print(f'Empty list size: {sys.getsizeof([])}')
print(f'Empty set size: {sys.getsizeof(set())}')

sys.getsizeof

import sys

print(f'Empty list size: {sys.getsizeof([])}')
lorem = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vitae nisl nisi. Donec malesuada luctus diam ac lacinia. Suspendisse porta dolor sem, id semper nibh tempor a. Proin porttitor nulla nec risus sollicitudin semper. Sed at lectus ante. Curabitur venenatis interdum malesuada. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed ut nisl rhoncus, laoreet diam et, blandit elit. Maecenas non quam dictum, ullamcorper massa ac, egestas tortor. Suspendisse venenatis leo nisl, vel mollis turpis consequat nec. Suspendisse lobortis auctor ante id condimentum. In porta, dui ultricies placerat dapibus, lorem ante euismod mi, et pretium lectus lorem fringilla mauris. Mauris aliquet, odio ac euismod mollis, lacus dolor accumsan velit, eu dignissim felis arcu eu ex. Nunc consectetur et sapien non iaculis. Sed dictum tellus velit.'
print(f'List with long string size: {sys.getsizeof([lorem])}')
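sys.getsizeof is shallow: the list above only accounts for its own pointer array, not the string it references. A recursive variant can be sketched like this (deep_getsizeof is a hypothetical helper, not part of the stdlib):

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Sum the sizes of obj and everything it references (rough sketch)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # don't count shared objects twice
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size

print(deep_getsizeof(['x' * 1000]))  # includes the string, unlike getsizeof
```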

tracemalloc

import tracemalloc

def ram_intensive_dummy_call() -> list:
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

tracemalloc.start()
snapshot1 = tracemalloc.take_snapshot()
ram_intensive_dummy_call()
snapshot2 = tracemalloc.take_snapshot()
top_stats = snapshot2.compare_to(snapshot1, 'lineno')
print("[ Top 10 differences ]")
for stat in top_stats[:10]:
    print(stat)

memory-profiler

$ poetry add memory_profiler
$ python -m memory_profiler load/memory_profiler.py


memray

$ poetry add memray
$ memray run -o out/memray.bin load/ram-intensive-program.py
$ memray flamegraph out/memray.bin
$ # ... out/memray-flamegraph-memray.html


IO

General advice

But it's not the whole story

Problems continue happening

Follow-up: what to do on a regular basis?

Tracing

Open Telemetry

from calls import cpu_intensive_call
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

if __name__ == '__main__':
    with tracer.start_as_current_span("cpu_intensive_call") as child:
        cpu_intensive_call(num_iterations=5000000)

Open Telemetry

from otel_helpers import catchtime, init_otel
from opentelemetry import trace, metrics

from calls import cpu_intensive_call

init_otel()
tracer = trace.get_tracer(__name__)
meter = metrics.get_meter(__name__)
execution_time_hgram = meter.create_histogram('execution_time')

with tracer.start_as_current_span("cpu_intensive_application") as parent:
    for x in range(3):
        with tracer.start_as_current_span("cpu_intensive_call") as child, catchtime() as t:
            cpu_intensive_call(num_iterations=5000000)
            execution_time_hgram.record(t())
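catchtime here comes from the talk's otel_helpers module; a minimal sketch of such a context manager, assuming it only needs to report elapsed seconds:

```python
import time
from contextlib import contextmanager

@contextmanager
def catchtime():
    # Yield a zero-argument callable that returns the seconds elapsed
    # since the with-block was entered.
    start = time.perf_counter()
    yield lambda: time.perf_counter() - start

with catchtime() as t:
    total = sum(range(1000))
print(f'Elapsed: {t():.6f}s')
```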

Open Telemetry

Multiple vendors, e.g. Grafana.

Alternatives

Grafana Stack: Loki, Prometheus.

Cloud instrumentation.

3 things to remember

3 4 things to remember

Thank you! Questions?

Acknowledgements

By Alex Ptakhin, Tech Lead at Prestatech GmbH, Berlin. [email protected] / github.com/aptakhin / twitter.com/aptakhin / hachyderm.io/@AlexPtakhin / linkedin.com/in/aptakhin

Latest slides

https://aptakhin.name/talks/2023-Sustainable-Python-Performance/

Secret reference slide