Toni Gruetze
Data Scientist, Architect, Consultant
Toni is a computer scientist with years of experience in distributed systems (Hadoop, Spark), as well as machine learning and artificial intelligence. He has organized several seminars on data mining with distributed systems and has collaborated with small and large companies on various projects. His strength is accompanying data mining projects from planning to production and scaling complex analyses on cluster systems. He has also published various peer-reviewed papers in the research areas of text and web mining.
Core Skills
Technical Skills
Skill: years of experience
Machine learning: 10
Data Integration: 10
SQL: 10
Python: 5
Java: 10
Work Experience
Data Engineer & Software Architect
2017 - present
Freelancer at German Universal Bank
- Analyzed a stream of news articles and aligned it with business entities in a graph database
- Developed a web app to explore the resulting news graph
- Topics: NLP, Machine Learning, Reinforcement Learning, Data Integration
- Technologies: Python, Java, JanusGraph, Kafka, Docker, GCP, Spring Boo
Data Engineer & Software Architect: Negative News Screening
2017 - 2019
German Universal Bank
- A stream of news articles was analyzed and aligned to business entities in a graph database. Furthermore, a web app was developed to explore the resulting news graph.
Tasks:
- Design ETL pipelines based on Apache Kafka to integrate business entities and article texts
- Implement NLP modules for news articles to find business entity mentions (named entity recognition and linking, sentiment analysis, document classification)
- Create a concept for data lineage of ETL pipeline steps
- Design the graph data model
- Optimize exploratory web app queries in the graph database
- Develop a REST API for the graph explorer web app
Impact:
- Developed the first cloud computing pilot and introduced it to the customer
- Designed a successful demo for the risk management team of the supervisory board
- Trained the web app team to use Spring Boot and JanusGraph
- Assisted the Data Science team in developing advanced matching models
Topics: NLP, Machine Learning, Data Integration, Data Modeling
Technologies: Python, Java, Apache Kafka, GCP, Docker, spaCy, Gremlin, JanusGraph, Apache Cassandra, Elasticsearch, Spring Boot
Software Architect and Team Lead: Data Ingestion and Analysis
2016 - 2017
Commerzbank AG
- Development of a system to build, curate, explore and analyze domain-specific knowledge graphs from structured and unstructured data sources.
Topics: Distributed Computing, Duplicate Detection, NLP, Machine Learning
Technologies: Scala, Apache Spark, Apache Cassandra, React, Jenkins, sbt
Researcher: Text Mining
2012 - 2017
Hasso Plattner Institute
- The research focused on showing that knowledge represented in user-generated content from various social media services can be used to significantly improve natural language processing and text mining tasks.
Topics: Text Mining, NLP, Machine Learning
Technologies: Java, Python, R, spaCy, scikit-learn, Weka, Keras, pandas, ggplot2, PostgreSQL
Team Lead: Big Data Analytics for Health Data
2009 - 2017
Hasso Plattner Institute
- Optimization of medical care with a system tailored to analyze large historical treatment data from health insurances
- Evaluation of different platforms with respect to their scalability
- Project Partner: Elsevier Health Analytics
- Topics: Big Data Analytics, Distributed Computing, Data Warehouse, Data Mining
- Technologies: Java, HPCC, PostgreSQL, i2b2
Software Architect and Team Lead: Data Ingestion and Analysis
2009 - 2017
Hasso Plattner Institute
- Development of a system to build, curate, explore and analyze domain-specific knowledge graphs from structured and unstructured data sources
- Project Partner: Commerzbank AG
- Topics: Distributed Computing, Duplicate Detection, NLP, Machine Learning
- Technologies: Scala, Apache Spark, Apache Cassandra, React
Researcher
2009 - 2017
Hasso Plattner Institute
- Research focused on showing that knowledge represented in user-generated content from various social media services can be used to significantly improve natural language processing and text mining tasks
- A selection of additional lectures and projects in cooperation with industrial partners is listed separately in this profile
Lecturer: Mining Massive Datasets, Seminar
2009 - 2017
Hasso Plattner Institute
- Students had to approach challenging big data problems using distributed computing frameworks like Apache Spark or Apache Flink and Amazon Web Services
- Topics: Big Data, Machine Learning, Data Mining
- Technologies: Apache Spark, Apache Flink
Lecturer: Distributed Big Data Analytics, Seminar
2009 - 2017
Hasso Plattner Institute
- Each student group had to compare the performance of the two distributed computing frameworks Apache Spark and Apache Flink on one challenging big data problem (e.g., graph mining or text mining)
- Topics: Distributed Computing, Big Data Analytics, Data Mining
- Technologies: Apache Spark, Apache Flink
Lecturer: Mining Massive Datasets
2016 - 2016
Hasso Plattner Institute
- Students had to approach challenging big data problems using distributed computing frameworks like Apache Spark or Apache Flink and Amazon Web Services
Topics: Big Data, Machine Learning, Data Mining
Technologies: Scala, Java, Apache Spark, AWS
Team Lead: Big Data Analytics for Health Data
2014 - 2015
Elsevier Health Analytics
- Optimize medical care with a system tailored to analyze large historical treatment data from health insurances. Evaluate different platforms with respect to their scalability.
Topics: Big Data Analytics, Distributed Computing, Data Warehouse, Data Mining
Technologies: Java, R, HPCC, PostgreSQL, i2b2
Lecturer: Distributed Big Data Analytics
2015 - 2015
Hasso Plattner Institute
- Each student group had to compare the performance of the two distributed computing frameworks Apache Spark and Apache Flink on one challenging big data problem (e.g., graph mining or text mining)
Topics: Distributed Computing, Big Data Analytics, Data Mining
Technologies: Scala, Java, Apache Spark, Apache Flink, AWS
Software Developer: Preventive Maintenance
2006 - 2009
Decision Optimization
- Enabling preventive maintenance decisions for gene analysis equipment by training machine learning models that predict failures
Project Partner: SigmaQuest, Inc.
Topics: Machine Learning, Predictive Maintenance
Technologies: Java, Weka, Oracle
Software Developer
2005 - 2006
Siemens R&D
- Managing guidelines, tolerances and limits of complex steam turbines for use in engineering and simulation applications
- Topics: Information Management
- Technologies: C#, Oracle
Education & Certificates
Doctor of Engineering - Information Systems (Dr.-Ing. / Ph.D.)
2012 - 2018
Hasso Plattner Institute, University of Potsdam
Master of Science - IT-Systems Engineering (M.Sc.)
2009 - 2011
Hasso Plattner Institute, University of Potsdam
Dipl.-Inf. (FH)
2003 - 2008
Hochschule Zittau/Görlitz
Languages
English
Native or bilingual
German
Professional working
French
Elementary
Greek
Elementary