Mark Andreev (Senior Software Engineer)

// Contacts mark.andreev@gmail.com, LinkedIn, GitHub
// CV: English

// Blog

- Deep dive into Apache Parquet Format
- How test Java code structure in unit tests (archunit)
- Java Two Way SSL Client (+ Spring example)

[ All snippets, External publications ]

// Talks

- "ML in production" at the FunTech ML-meetup
- "Streaming vs Batching" at the Conundrum Meetup

// Projects

- [App] fix parser
- [App] .gitignore generator
- [Demo] Time series player
- [Demo] Offline lock with Redis
- Bigdata Indicators
- Tornado-swagger
- EnduringNet

// Certificates

- CockroachDB Query Performance for Developers
- Redis for Java Developers
- ScyllaDB. Data Modeling and Application Development
- AWS Well-Architected Training
- Deep Dive into AWS S3, Glacier, EFS
- Deep Dive on Container Security

// Development Stack

Java. Spring: MVC, Data, AMQP, Kafka, Integration, Batch, Security, State machine, Apache Camel, Vert.x, GraalVM.
Python. Pandas, Scikit-learn, XGBoost, LightGBM, CatBoost, Matplotlib, Tornado, FastAPI, Flask.
Data Processing. Spark, Flink, Cassandra, Hadoop, Kafka, PostgreSQL.
Third party. Oracle Database, PostgreSQL, ClickHouse, Kafka, Keycloak, MongoDB, RabbitMQ, Redis, Prometheus, Docker, Kubernetes, Helm, Linux, Airflow.
Cloud. AWS {EC2, S3, RDS, CloudFront, SQS, SNS, Lambda, Batch, IAM, Registry}; Azure {VM, BLOB, Registry}

// Experience summary

Senior Software Engineer specialising in platform optimisation for ML/MLOps and trading systems, with hands‑on impact in AHL’s OMS through asset‑class onboarding and user‑facing latency reductions.

Technical breadth spans Java and Spring, Kafka, ClickHouse, Oracle and PostgreSQL, Python, Kubernetes, Prometheus/Grafana, Keycloak, and Linux, with additional experience in gRPC and deployments on AWS and Azure.

Active open‑source contributor to Apache Spark, Apache Airflow, Apache Ignite, Apache Camel, Keycloak, Vaadin Flow, and Tornado Swagger, with work ranging from Docker/Kubernetes operator enhancements to target‑encoding and CatBoost inference integration.

Holds a Master’s in Applied Mathematics and Informatics from Lomonosov Moscow State University, with thesis and publications in macroeconomic monitoring and inflation expectations.

// Experience

Man Group plc, 2024 - now
Senior Software Engineer, May 2024 - now

Onboarded new asset classes and flows into the trading pipeline
Optimised Oracle SQL queries in the user-facing flow through manual analysis of SQL*Plus Autotrace output | Decreased latency of the most critical query by 60%
Optimised the user-facing flow through manual analysis of AsyncProfiler output | Decreased latency by 70%
Adopted Vaadin for internal developer-focused UI | Moved teams off local snippets and improved cross‑team knowledge sharing

[Java, Oracle Database, Kafka, Python, Linux]

Conundrum.ai, 2017 - 2024
Senior Software Engineer, Sep 2022 - May 2024

Implemented low-level optimizations in the Python SDK used to integrate with the platform | Decreased RAM usage by 30%
Implemented low-level optimization for feature store on top of Kafka & ClickHouse (projections, application-level query planner) | Reduced execution time of the most popular queries by 63%
Covered 80% of queries with Gatling load tests to catch major performance degradations
Created performance optimizations for Kafka subscription proxy (Java 21 virtual threads, shared subscription) | Decreased CPU load by 80%
Implemented security improvements for the platform (audit, L4 network policies, L7 network filter) | Applied information security (IS) requirements at the network level
Instrumented the platform’s services with health and performance metrics (Prometheus, Grafana, Alerts) | Decreased issue investigation time by 64%

[Java, Python, Query optimizations, Load tests, Kafka, ClickHouse, Kubernetes, Feature store, Spring, Vert.x, GraalVM]

Senior Software Engineer, Sep 2019 - Aug 2022

Developed microservices with Spring Boot, PostgreSQL and gRPC
Migrated feature store to Kafka & ClickHouse (columnar OLAP database) | Decreased query execution time by 90%
Created low-level connectors for Industrial Data Exchange formats (MQTT, OPC UA, Historian) | Decreased CPU load on the exchange server by 68%
Migrated the model-serving runtime to Kubernetes (Kubernetes API, Helm)
Deployed platform to AKS (Azure Cloud) & K3s (on-premises, no internet)
Migrated to Keycloak (SSO) for Authz

[Java, Python, Kubernetes, Helm, Kafka, ClickHouse, Azure, AKS, OLAP, Feature store, Spring]

Mid-level Software Engineer, Nov 2017 - Sep 2019

Created feature store for sensors’ time series data (Java, Spring, PostgreSQL, TimescaleDB)
Created model serving runtime server (Python, Processes)
Created incident management service (Java, Spring, Spring State Machine)
Created ETL pipelines using S3, SQS, S3-SFTP

[Java, Spring, Python, AWS, ETL, Feature store, PostgreSQL, TimescaleDB, State machines, S3, SQS, S3 SFTP]

Junior Machine Learning Engineer, May 2017 - Oct 2017

Airline data clustering; created an approach for data splitting for offline A/B tests
Telecom data churn; offline churn scoring based on telecom data activity
Web data gender detection; offline gender detection based on web activity
Mobile data geo-analysis; created reports about geolocation activity based on mobile location data
Industrial time-series data; created data pipeline for failure prediction

[Python, PySpark, Spark, SQL, Scikit-learn, Pandas, XGBoost]

Big Data Indicators Internship, Oct 2016 - May 2017
Machine Learning Engineer

Created data collection & processing pipeline
Used topic models to discover trends
Created sentiment analysis models for trend prediction

[Data mining, Python, MongoDB, Redis, Machine learning, NLP, Topic models]

// Education

Lomonosov Moscow State University
Master's degree. Computational Mathematics and Cybernetics

Moscow Power Engineering Institute
Bachelor's degree. Institute of automatics and computer science

// Open Source Contributions

Apache Spark
- [SPARK-49044][SQL] ValidateExternalType should return child in error.
- [SPARK-49490][SQL] Add benchmarks for initCap.
- [SPARK-49549][SQL] Assign a name to the error conditions _LEGACY_ERROR_TEMP_3055, 3146.
Apache Airflow
- [AIRFLOW-43853] Add logging support for init containers in KubernetesPodOperator (#42498)
- [AIRFLOW-43847] Add random_name_suffix to SparkKubernetesOperator (#43800) (#43847)
- [AIRFLOW-43840] Fix logs with leading spaces in the Docker operator (#33692) (#43840)
- [AIRFLOW-27553] Add ipc_mode for DockerOperator (#27553)
Apache Ignite
- [IGNITE-13713] Implemented target encoding preprocessor.
- [IGNITE-13714] Implemented catboost inference integration.
- [IGNITE-13386] Implemented new distances (BrayCurtis, Canberra, JensenShannon, etc.).
Keycloak
- [KEYCLOAK-19743] Fix null username in ldap.
Apache Camel
- [CAMEL-16092] Add fix for header override by Azure Storage Blob download consumer.
ClickHouse
- [ClickHouse-37228] Remove group.id from StorageKafka::createWriteBuffer.
Vaadin Flow
- [FLOW-20058] fix: add nodeVersion in gradle plugin settings.
Tornado Swagger
- [REPOSITORY] Swagger API Documentation builder for tornado server.

// Publications

- A New Approach to Determining the Attitude of Authors of Short Texts to the Topics Discussed in the Texts on the Example of Estimating the Inflations Expectations, Oct 2017. DATA ANALYTICS AND MANAGEMENT IN DATA INTENSIVE DOMAINS, DAMDID / RCDL’2017. Andreev M.
- Big Data approach to measure inflation expectations: the case of the Russian economy, Jul 16, 2017. IFABS 2017 Oxford Conference. Goloshchapova I., Andreev M.
- Measuring inflation expectations of the Russian population with the help of machine learning. Voprosy Ekonomiki. 2017;(6):71-93. (In Russ.). Goloshchapova I., Andreev M.
