
Konstantin Kudrinskii

Not available until 30.06.2024

Last update: 22.03.2024

Data Engineer, Spark, Azure Databricks, Data Lake, Lakehouse Architect, PySpark

Degree: Bachelor's degree in Applied Informatics and Mathematics, specializing in Probability Theory and Statistics.
Languages: German (business fluent) | English (business fluent) | Russian (native speaker)

File attachments

Azure-Solution-Architect-Data-Engineer-Certificate_271223.pdf
AWS-Certified-Data-Analytics-Specialty_271223.pdf
AWS-Certified-Solutions-Architect-Associate_271223.pdf
Goethe-Zertifikat-C1_020124.pdf
Databricks-Data-Engineer-Professional_110124.pdf

Skills

Expert in data engineering (Azure, Databricks, Spark, Hadoop, Kafka, AWS), software development (Java, Scala, Python), and DevOps (Terraform, Azure DevOps, AWS, and GitLab CI/CD).
Proven track record of successfully completed projects, with a demonstrated ability to deliver highly efficient solutions in accordance with well-architected principles and best practices (SOLID, TDD, CI/CD, Agile). More than a dozen successfully architected and improved data analytics solutions, including eight projects with Databricks.
I hold the following certifications and continually keep my knowledge up to date:
- Azure Solutions Architect Expert
- Azure Data Engineer Associate
- AWS Certified Solutions Architect – Associate
- AWS Certified Data Analytics – Specialty

Project history

10/2020 - present
Data Architect, Data Engineer, MLOps, DevOps

Customer 1:

Development and operation of the Enterprise Data Lake (corporate Data Lakehouse) and Data Science Lab (a reusable, secure solution deployed for roughly 70 clients), automating infrastructure tasks and CI/CD pipelines.
  • Azure Databricks, Azure AD, Azure DevOps, ADLS Gen.2, Azure Data Factory, Azure Machine Learning, MLflow.
Improved DevOps pipelines and ML infrastructure for the commodity pricing prediction project.
  • Azure DevOps, Kubernetes, Terraform, Azure Databricks, Spark, Azure SQL, Seldon ML, Power BI, Grafana.
Developed a cost-efficient Advanced Cybersecurity Analytics solution that consumes large volumes of highly sensitive data and transforms them in real time.
  • Kubernetes, Terraform, Azure DevOps, Confluent, Kafka, Logstash, Java, Rust.
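The profile does not detail the transformations involved; purely as an illustration of the kind of in-flight pseudonymization such a pipeline applies to sensitive fields before forwarding them (the field names and salt below are hypothetical, not from the actual project), the core logic might look like this:

```python
import hashlib

# Hypothetical set of sensitive fields to pseudonymize before forwarding.
SENSITIVE_FIELDS = {"username", "source_ip"}
SALT = b"example-salt"  # in practice: a secret from a vault, never a literal

def pseudonymize(event: dict) -> dict:
    """Replace sensitive field values with a salted SHA-256 digest,
    keeping the rest of the event intact."""
    out = dict(event)
    for field in SENSITIVE_FIELDS & out.keys():
        digest = hashlib.sha256(SALT + str(out[field]).encode()).hexdigest()
        out[field] = digest[:16]  # truncated digest serves as a stable pseudonym
    return out

raw = {"username": "jdoe", "source_ip": "10.0.0.5", "action": "login"}
masked = pseudonymize(raw)
```

Because the digest is deterministic, the same user maps to the same pseudonym across events, so downstream analytics can still correlate activity without seeing raw identifiers.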
Architecture and development of three data integration projects (a B2B platform for the cosmetics industry, production-site data analytics, an operational data warehouse).
  • Azure Databricks, PySpark, Kafka, dbx, Azure Data Factory, Azure SQL, Synapse Analytics.
Customer 2:
Development of the corporate Data Lakehouse based on Azure Databricks, PySpark, Python, Airflow, Unity Catalog, dbx.
Development of the streaming data ingestion and processing using Kafka and Spark Structured Streaming.
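In the project this runs as a PySpark Structured Streaming job reading from Kafka; as a dependency-free sketch of the event-time windowed aggregation such a job typically performs (the event schema and window size here are hypothetical), the same logic in plain Python is:

```python
from collections import Counter

WINDOW_SECONDS = 60  # hypothetical tumbling-window size

def window_start(ts: float) -> int:
    """Align an epoch timestamp to the start of its tumbling window."""
    return int(ts) - int(ts) % WINDOW_SECONDS

def aggregate(events):
    """Count events per (window, key) -- the shape of a Structured
    Streaming groupBy(window(...), key).count() aggregation."""
    counts = Counter()
    for ts, key in events:
        counts[(window_start(ts), key)] += 1
    return counts

# Events as (epoch_timestamp, key) pairs, e.g. parsed from Kafka messages.
events = [(0.5, "a"), (10.0, "a"), (61.2, "b"), (62.0, "a")]
result = aggregate(events)
```

The real streaming engine additionally handles late data and state management (watermarks, checkpointing), which this batch sketch deliberately omits.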

Customer 3:
Building an end-to-end data pipeline for a medical diagnosis support system, architecting and developing the complete MLOps solution on Databricks and Google Cloud.
Optimizing performance for real-time data processing.
  • Azure Databricks, PySpark, GCP, Pub/Sub, Cloud Functions, Polars, MLflow, Streamlit.

06/2020 - 09/2020
Big Data Architect
Ultra Tendency GmbH (internet and information technology, 50-250 employees)

Consulting for one of the largest insurance companies in Germany on a cloud Data Lake concept based on Azure Databricks, with a strong DevOps focus:

  • Databricks, Azure Data Factory, ADLS Gen.2, Azure AD
  • Documenting the concepts: DevOps, IAM, cryptography, business continuity, auditing, logging. Building a PoC for Terraform-managed environments.

Interim Product Owner / Team Lead in the international IIoT project (GAIA-X):

  • Docker, Kubernetes, Java, Spring Boot, Apache Camel
  • Agile, Scrum.

01/2020 - 04/2020
Big Data Architect
Adastra GmbH (internet and information technology, 50-250 employees)

Architecting a data integration framework on AWS with a strong focus on data management and DevOps best practices:

  • Modularized and extensible implementation
  • Support for the most complex data management use cases: Change Data Capture, Slowly Changing Dimension Type 2, schema changes
  • Auditing and data lineage
  • Support for CI/CD with test-driven development
  • Written in Python, PySpark, Pandas, Jinja2 templates, using Airflow to orchestrate ETL and maintenance jobs.
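As a minimal illustration of the Slowly Changing Dimension Type 2 handling listed above (the real framework does this in PySpark against warehouse tables; the column names here are hypothetical), the core close-and-insert logic is:

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # sentinel end date marking the current version

def apply_scd2(dim_rows, change, today):
    """Close the currently open version of a changed record and append
    the new one, so the full history remains queryable (SCD Type 2)."""
    out = []
    for row in dim_rows:
        if row["key"] == change["key"] and row["valid_to"] == OPEN_END:
            out.append(dict(row, valid_to=today))  # close the open version
        else:
            out.append(row)
    # Append the new current version, open-ended until the next change.
    out.append(dict(change, valid_from=today, valid_to=OPEN_END))
    return out

dim = [{"key": 1, "city": "Berlin",
        "valid_from": date(2020, 1, 1), "valid_to": OPEN_END}]
dim = apply_scd2(dim, {"key": 1, "city": "Frankfurt"}, date(2024, 1, 1))
```

In a warehouse setting the same pattern is usually expressed as a single `MERGE` statement rather than a row-by-row loop.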

05/2016 - 12/2019
Data Engineering Tech Lead
360T (Deutsche Börse Group) (banking and financial services, 50-250 employees)

  • Development and maintenance of the Big Data infrastructure (Cloudera Hadoop, Spark, Kafka, Impala, SQL, Scala, Python).
  • Architecting the new data processing solution in accordance with DataOps principles and requirements for auditing and data quality assurance.
  • The developed framework is based on the same technologies (Apache Spark) and architectural principles as AWS Glue, and is written in Java, Spring Boot, Spring MVC, and Thymeleaf.
  • Overall improvements in the data infrastructure, including monitoring dashboards, custom Web applications, patching (JIRA, Python, and Bash scripts), quality assurance and CI/CD routines (JIRA, Bamboo), documentation.
  • Responsible for developing MiFID II compliance reporting; the project was successfully completed, meeting a hard regulatory deadline.
  • Advanced analytics (SQL, Spark, Python, ML) on large and complex data sets (billions of records daily), both in batch and real-time.

03/2010 - 04/2016
Senior Java Developer / Data Analytics Team Lead
Phorm (internet and information technology, 50-250 employees)

  • Development of a digital advertising exchange platform as Senior Java EE Developer, covering both the back end (Java EE, Hibernate, Spring) and the front end (JavaScript).
  • Leading the Data Analytics team from January 2013, providing solutions for advanced SQL (Postgres) and multi-dimensional analysis (Pentaho Analytics, OLAP cubes).
  • Setting up the Cloudera Hadoop cluster (Oozie, Sqoop, Impala, Hive, Spark) to process large volumes of data.

Certificates

Databricks Certified Data Engineer Professional
2024
Goethe-Zertifikat C1
2022
Microsoft Certified: Azure Data Engineer Associate
2021
Microsoft Certified: Azure Solutions Architect Expert
2020
AWS Certified Data Analytics – Specialty
2020
AWS Certified Solutions Architect – Associate
2020

Willingness to travel

Available in: Germany