Kun Lu, not available until 31.07.2020

Kun Lu

NLP, Data Science; Machine Learning; Text Mining; Python; Spark

  • Freelancer in 81549 Munich
  • Degree: Dr.-Ing.
  • Hourly/daily rate: not specified
  • Languages: Chinese (native) | German (business fluent) | English (business fluent)
  • Last update: 22.06.2020
  • 7+ years of experience with Natural Language Processing (NLP)
    • Building semantic graphs, i.e. knowledge or ontology graphs
    • Full chain of NLP tasks: tagging, parsing, Named Entity Recognition (NER), relation recognition
  • Statistical analysis and modeling
  • Machine Learning: scikit-learn (sklearn), Keras, TensorFlow
  • 9+ years of experience with Python
  • Spark, Spark SQL, Spark ML, Elasticsearch with Kibana (ELK)
  • Docker, AWS, Grafana, Prometheus, Go, Ansible (offered indirectly)
  • 2017.09 - 2018.01 NLP specialist for an E-Commerce company
    • Analyze the text of product names, descriptions, etc. to improve search quality
    • Develop a novel POS tagger, based on Python (NLTK, scikit-learn), Git, and an AWS Hadoop cluster
  • 2017.01 - 2017.09 Data scientist for an automobile company
    • Knowledge transfer
    • Product owner for a WebApp to improve data quality (R, Shiny); successfully released
      • Data visualization (Qlik Sense)
      • Reverse engineering of existing Visual Basic code
      • POC for task automation to save enterprise cost
  • 2017.03 - 2017.06 In-house projects
    • Use Kafka to process real-time data streams
    • Forward Kafka’s output to Spark/Spark ML for further processing: detect features of the data, perform statistical analysis, and develop predictive models
  • 2016.08 - 2017.01 Data scientist in the Data-Analytics Group of metaFinanz GmbH
    • Specialized in web-mining and text-analytics
    • Crawl websites, apply text analytics techniques to extract information
    • Build dashboards using Shiny
    • Natural language processing for German: POS-tagging and stemming based on statistical inference; topic and sentiment analysis of the news
    • Project experience with IBM Watson: technical consulting on the architecture and principles of IBM Watson regarding the construction of domain-specific ontology graphs and Bayesian inference. The use case is automatic diagnostics in healthcare systems.
  • 2015.01 - 2016.07 Software architect and developer for my own information retrieval project
    • Information retrieval and knowledge discovery with NLP and text-mining techniques:
      • Data source: scientific papers (20+ million papers)
      • Implemented in Python
      • Activities:
        • crawled online databases of scientific papers and online English dictionaries
        • constructed a way to identify concepts from raw text
        • calculated semantic and lexical statistics of the concepts
        • categorized concepts using statistical inference
        • developed a long short-term memory mechanism to find related concepts, thereby obtaining a knowledge network
        • classified papers by topic
      • Expertise: text mining, natural language processing (NLP), statistical modeling, semantic search, regular expressions
      • Demo: http://munich-datageeks.de/2016/09/13/kun-lu-text-mining-on-academic-
  • 2015.06 - 2015.12 Big-Data engineer (Professional Consultant), SHS Viveon
    • Member of the Data-Warehousing Team specialized in Big-Data
    • Participated in various internal and external big-data projects:
      • integrate different data sources with Spark and Spark SQL in order to improve the ETL process
      • build dashboards with Elasticsearch and Kibana
      • parse SAS files with Logstash
      • extract data from contract photos, written in Scala
      • optimize data models for business reporting
    • Tools/Techniques: Spark, Scala, Elasticsearch, Logstash, Hadoop, MySQL, Microsoft SQL Server, SSAS, regular expressions
  • Within Germany:
    • Flexible
    • 1 day per week remote preferred but not required