My name is Paul. I am a California-born, Bavaria-based researcher and PhD student with a teaching assignment in the field of computational linguistics. I am currently working with the amazing Stefan Evert and his computational corpus linguistics group at the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU).
My professional interests cover everything concerning natural language processing, in research as well as in real-world applications, specifically the identification and measurement of semantic textual similarity and polarity classification. During my studies, I focused on rule-based systems; in recent years, I have shifted to statistical techniques.
Currently, I am working on my PhD thesis, which explores the possibilities of refining semantic text clustering by applying techniques from text summarization and keyword extraction. This should prove handy for a number of applications, most notably the analysis of online polls or other digitally accessible text collections for the purpose of market research. For this endeavor, I coined the term Natural Language Condensation. For a quick overview of the most important buzzwords, check out this fancy flow chart. There is also a complete concept sketch with more detailed information about the thesis, should you be interested.
My non-professional interests are manifold, ranging from rock climbing to playing the guitar. Stuff I like as much as everybody else includes travelling to distant countries, enjoying good food and watching the occasional sci-fi movie.
Feel free to browse these pages or drop me a line.
- Evert, Stefan, Paul Greiner, João Filipe Baigger, Bastian Lang (2016): “A distributional approach to open questions in market research.” In: Computers in Industry · Link · The accepted manuscript is available as a PDF here.
- Proisl, Thomas, Stefan Evert, Paul Greiner, Besim Kabashi (2014): “SemantiKLUE: Robust Semantic Similarity at Multiple Levels Using Maximum Weight Matching.” In: Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Association for Computational Linguistics and Dublin City University, 532–540 · PDF
- Evert, Stefan, Thomas Proisl, Paul Greiner, Besim Kabashi (2014): “SentiKLUE: Updating a Polarity Classifier in 48 Hours.” In: Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Association for Computational Linguistics and Dublin City University, 551–555 · PDF
- Greiner, Paul, Thomas Proisl, Stefan Evert, Besim Kabashi (2013): “KLUE-CORE: A regression model of semantic textual similarity.” In: *SEM 2013: The Second Joint Conference on Lexical and Computational Semantics. Association for Computational Linguistics, 181–186 · PDF
- Proisl, Thomas, Paul Greiner, Stefan Evert, Besim Kabashi (2013): “KLUE: Simple and robust methods for polarity classification.” In: Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013). Association for Computational Linguistics, 395–401 · PDF
- Weber, Carsten, Johannes Handl, Stefan Reihl, Paul Greiner (2010): “The JSLIM 2.1 Documentation: Morphology, Syntax and Formal Languages.” In: CLUE Technical Reports (Number 10). Abteilung für Computerlinguistik Uni Erlangen-Nürnberg (CLUE).
- In cooperation with the German market research agency Rogator AG, I have been involved in the conceptualization and implementation of The Klugator Engine (TKE). There is an article in Computers in Industry describing this project extensively. You can have a look at a copy of the accepted manuscript for free. TKE serves as the template for the core engine of the commercially available text analysis software RogTCS. Check out the corresponding website for an elaborate description.
- I am currently working on my PhD thesis, whose working title is “A Human Readable Topic Clustering – An Applied Case Study on Natural Language Condensation”. This handy flowchart gives a first impression. For a more thorough introduction, have a look at this comprehensive concept sketch.
- Others: Check out my list of publications and the corresponding links to get an idea of what I have been up to so far. The miscellaneous section might also provide material of interest.
So far, my teaching with the computational corpus linguistics group in Erlangen has been in German. Should you be interested in English content, feel free to ask.
WS17/18: Web4Science – Aufbereitung, Visualisierung und Präsentation wissenschaftlicher Ergebnisse mit Methoden (Web4Science: preparing, visualizing and presenting scientific results with methods)
SS17: Grundseminar Programmierung Python (introductory seminar on programming in Python)
Content: “Students acquire · programming skills in a programming language suited to the efficient development and application of language technology solutions (e.g. Python) · the ability to develop and test software · practice- and research-oriented approaches to solving problems in computational and corpus linguistics using language technology resources” – excerpt from the module handbook for the degree programme Linguistische Informatik.
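WS16/17: Arbeitstechniken der Computerlinguistik (working techniques in computational linguistics)
This course is part of the module “Grundlagen der Computerlinguistik I” (Foundations of Computational Linguistics I), which is described in detail in the module handbook for the degree programme Linguistische Informatik.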
- Should you be interested in even older stuff, try searching the FAU UnivIS for my name or just drop me a line.
Initially, I put up winking faces around here to mark entries that were not to be taken seriously. Then it became hard to identify content in between all those emojis, so I decided to do away with them. Viewer discretion is advised ;)
- Some nerd-cred: Very few people know what will happen beyond 100k. I am one of them.
- Ever thought about checking out my behind? I wouldn't know why, but enjoy the view!
- For a while, I was involved with the Fränkische Wörterbuch (Franconian dictionary), taking care of backend processing. It is an astounding online resource and archive dealing with one of the most fascinating means of communication of all time: Franconian.
- Just in case you were in doubt: I do know some German and might even have written a line or two.
- Of course, there is a good reason my thesis is yet to be finished. Originally, I was planning to post a picture of my kid here as a cute excuse. Then I was like “well, this is the internet” and decided against it. Just know that he is the greatest.
- Some cool stuff, in no particular order, made by people I am absolutely not acquainted with: the one and only xkcd · a better nethack · some of my (local) heroes · yet another four letter comic · an operating system that happens to be agnostic towards operating systems · sth. sth. with cats O.o · incredibly credible world news · nothing to see here, just love the url · this seems so familiar · my favorite place to steal bib entries · even dinosaurs do this stuff · I really like this language, and here is why. · one of the many great reasons to love Franconia · Now that I think of it, I actually might be acquainted with some of the folks behind the preceding links. Don't hold me responsible for whatever they might put on their pages, though.
Approach me via one of the following channels, should you want to get in touch. I am currently not on Facebook, Twitter or any other platform that is not listed below.
- E-mail: email@example.com
Office and mailing address
Professur für Korpuslinguistik
- Official pages of the University of Erlangen-Nürnberg: FAU Germanistik · FAU Korpuslinguistik · FAU UnivIS (in this one, you will have to look up my name via the “Person” field)
- Networking service(s): LinkedIn
Looking forward to hearing from you.