Interactive Labeling of Brands in Print Advertisements
- Subject: Bachelor’s or Master’s thesis with the goal of designing and developing an interactive labeling system for identifying brands in advertisements from scanned newspaper archives.
- Type: Bachelor / Master Thesis
- Add on:
As the digitization of the world's libraries and print archives continues steadily, the demand for automated processing of such documents grows. Researchers and practitioners would like to process these documents digitally with tools from computer vision (CV) and optical character recognition (OCR). They would also like to search and filter by document metadata. However, all of this presumes that the extracted features and metadata are available. As state-of-the-art machine learning (ML) classifiers still do not reach the desired accuracy levels, especially on old documents or those from fringe contexts, manual labeling effort is required.
For the scope of this thesis, we limit the context to identifying brands in advertisements from scanned pages of newspapers and magazines. This is an interesting use case for advertising researchers, among others. Associated colleagues at the University of Mannheim (UniMA) have already roughly extracted the brands of advertisements in the British magazine "The Economist", ranging from the 1840s to today. They used OCR to arrive at a simple representation of the advertising brand. We expect the thesis student to develop an interactive labeling system that supports extending this brand identification toward a cleaner representation. Interactive labeling strives to combine automatic steps (e.g. a trained model) with incremental user input. The work packages entail:
- analyzing the state of the art in such instance identification tools (potentially by conducting a structured literature review)
- exchanging with the researchers at UniMA regarding their needs and requirements
- developing an interactive labeling system as part of a design science research process
- writing a thesis document according to research group requirements and participating in our thesis colloquium
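To make the idea of interactive labeling more concrete, the following is a minimal Python sketch and not a prescribed design. It assumes that the OCR output is a list of noisy brand strings, uses plain string similarity from the standard library as a stand-in for a trained model, and asks the user only when the automatic step is not confident. The names `label_brands`, `ask_user`, and the threshold value are hypothetical and chosen for illustration only.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Iterable, List


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def label_brands(
    ocr_strings: Iterable[str],
    known_brands: List[str],
    ask_user: Callable[[str, List[str]], str],
    threshold: float = 0.85,  # assumed cut-off, would need tuning on real data
) -> Dict[str, str]:
    """Map each raw OCR string to a clean brand name.

    Confident matches are labeled automatically. For everything else the
    user is asked, and the answer is added to `known_brands` so that later
    items can benefit from it (the incremental part).
    """
    labels: Dict[str, str] = {}
    for raw in ocr_strings:
        # Automatic step: rank the known brands by similarity to the OCR text.
        ranked = sorted(known_brands, key=lambda b: similarity(raw, b), reverse=True)
        if ranked and similarity(raw, ranked[0]) >= threshold:
            labels[raw] = ranked[0]
            continue
        # Interactive step: show the top suggestions and let the user decide.
        answer = ask_user(raw, ranked[:3])
        labels[raw] = answer
        if answer not in known_brands:
            known_brands.append(answer)
    return labels
```

In a web-based system, `ask_user` would correspond to a request to the frontend. For a quick experiment it can be any function that takes the raw string and a list of suggestions and returns the chosen brand name.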
Design science research is a well-established methodology in the information systems field that takes a scientific view of artifacts, such as the labeling system to be developed during this thesis. So-called design knowledge can be derived from both the development process and the finished artifact.
We expect the student to be familiar with web development. The system should be developed with a modern web application frontend framework or be forked from an existing open source labeling system. Furthermore, we expect the backend to be based on standard Python frameworks. Experience in this regard is required as well.
If you are interested in this topic and want to apply for this thesis, please contact Merlin Knäble with a short motivation statement, your CV, and a current transcript of records. Feel free to reach out beforehand if you have any questions.