Diogo Aurelio


Currently working as a freelance Data Engineer & Architect on projects for startups and enterprises, mainly in Germany, as part of the GoSmarten group. GoSmarten is a collective of engineers with extensive experience in software architecture, data analysis, and data processing, providing consulting and hands-on professional services. We cover key Big Data areas: data lake building (Hive, Presto, Pig, MapReduce, Spark, Flink, Beam, Redshift), real-time distributed applications (Akka, RabbitMQ, Kafka, Kinesis, Play, Elasticsearch), DevOps (Docker, Kubernetes, AWS), machine learning (scikit-learn, Spark MLlib, TensorFlow, DL4J), and programming in Java, Scala, Python, R, Node.js, etc. I also run a tech blog (https://datacenternotes.com/), lately focused mainly on big data topics.

Germany, Berlin
Product Design
Machine learning
Apache Hadoop
Lambda Architecture
Diogo is currently not available.



Technical Skills

Years of experience



8 years of experience



4 years of experience



Data Engineer

2014 - 2015

PricePanda Group

  • Hadoop cluster (Hortonworks): cluster setup, DistCp + MR + Pig jobs
  • DevOps: migration and setup of infrastructure on AWS, continuous integration & deployment, Docker
  • Business Intelligence team: supported the BI team in developing an R-based web server, ETL scripts, and deployment
  • Internal Node.js crawler: development and deployment
  • Tech stack: Data Engineering (Hadoop Hortonworks/Spark, Postgres, S3), DevOps (Python, Bash scripting), Cloud (AWS, CentOS, Ubuntu), CI/CD (Jenkins, Docker)

Start-up Founder

2013 - 2014

About IT

  • About IT was a marketplace for client-to-client IT consulting, offering independent opinions, up-to-date hands-on experience, and more cost-efficient advice
  • Tech stack: Ruby on Rails, AWS (EC2, ELB, RDS Postgres), DevOps (Chef, Capistrano, Bash, Python)

Data Engineer

2008 - 2013


  • Developing Big Data projects as a freelance developer and architect for German startups and enterprises
  • Clients: Bonial International Group GmbH (Berlin), Lesson 9 GmbH - Babbel (Berlin), Orderbird GmbH, Otto GmbH & Co (Hamburg)
  • Akka Streams (Scala) project for sending real-time push notifications
  • Contribution to an open-source project for Hadoop Hive data lake management and scheduling, based on Scala and Akka: https://github.com/ottogroup/schedoscope
  • Machine learning PoCs for retention prediction and for evaluating a recommendation system to improve retention, using Spark + AWS (Lambda, API Gateway, DynamoDB)
  • Set up batch and streaming pipelines (lambda architecture) for event processing and storage (in the so-called "data lake") with Spark, S3, HDFS Avro/Parquet, Hive, Schedoscope (Otto Group's own open-source Scala pipeline scheduler), Oozie, and AWS Data Pipeline
  • Supported BI in the migration from SQL Server to AWS Redshift and with visualization tools
  • Set up a "Databricks"-like environment for data engineers, data scientists, and BI analysts (based on IPython Notebook with Spark on EMR) and coached them in its use
  • DevOps, mainly in an AWS environment, and Hadoop cluster setup

Pre-sales Technical Consultant Datacenter Solutions

2009 - 2013

Hewlett-Packard

  • Focused on data center solutions (computing infrastructure, NAS/SAN, backup storage infrastructure, and networking), as well as corporate LAN environments (WLAN, wired, and security)
  • Solution design and sizing
  • Proposal writing and presentation to the final client
  • Post-sales support

Education & Certificates

Online Course Scalable Machine Learning with Apache Spark

2015 - 2015

University of California, Berkeley

Online Course Introduction to Big Data with Apache Spark

2015 - 2015

University of California, Berkeley

Online Course Machine Learning

2015 - 2015

Stanford University

Online Course Statistical Learning

2015 - 2015

Stanford University

Online Course Statistical Inference

2014 - 2014

Johns Hopkins Bloomberg School of Public Health

Online Course Practical Machine Learning

2014 - 2014

Johns Hopkins Bloomberg School of Public Health

Paper Publication: A Framework for Evaluating Lean Implementation Appropriateness

2011 - 2011

Industrial Engineering

2004 - 2009

Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia



Native or bilingual


Full Professional


Native or bilingual

