Skip to content

Patrick Steinert Posts

Updates KW 25

Hey Leute, vergangene Woche war vollgepackt mit spannenden Projekten und Aktivitäten. Hier ein kurzer Überblick: Tech-Projekte: RAG mit Langchain: Ich habe endlich Zeit gefunden, Retrieval-Augmented Generation (RAG) mit Langchain zu testen. Die Evaluierungen waren recht aufschlussreich, und ich bin begeistert von den Möglichkeiten, die sich dadurch für meine Projekte eröffnen. llama.cpp: Parallel dazu habe ich mich mit llama.cpp beschäftigt. Es ist faszinierend zu sehen, wie effizient diese Implementierung große Sprachmodelle auf Consumer-Hardware laufen lässt. Whisper OpenAI Plugin: Ein echtes Highlight war die Integration des Whisper OpenAI Plugins für Information Retrieval. Ich habe damit gleich ein 256 Metaverse Dataset indexiert – die Ergebnisse sind vielversprechend! Sonstiges: Sport: Trotz des vollen Terminkalenders habe ich es geschafft, regelmäßig Sport zu treiben. Es hilft ungemein, den Kopf frei zu bekommen. Kleinanzeigen: Ich habe mich endlich dazu durchgerungen, ein paar Sachen zu verkaufen, die ich nicht mehr brauche. Überraschend, wie viel sich da über die…

Leave a Comment

Five Levels of Autonomous Coding

A few weeks ago, we had a brainstorming session to challenge the statement: “In 2026, simple coding of business software by a human is unprofitable.” It quickly dawned on me that for this prediction to hold, we would need fully autonomous coding or at least a high degree of automation. This concept immediately reminded me of the various levels of autonomous driving—Eureka! Of course, I wasn’t the first to make this connection; someone on the internet had brilliantly mapped these levels from driving to coding.
Let’s dive into these levels to understand better how they might apply to the future of coding:

Level 1: Assisted Coding

  • What Happens: Coders handle the bulk of the work but can request autogenerated code snippets to copy-paste or use as code completion.
  • Responsibility: Coders must validate and are ultimately responsible for all code, ensuring accuracy and functionality.

Level 2: Partly Automated Coding

  • What Happens: Coders primarily use the IDE to specify features, and the AI then modifies the code accordingly.
  • Responsibility: While the AI handles some coding, coders must validate all changes and remain responsible for the final output.

Level 3: Highly Automated Coding

  • What Happens: Coders use a more advanced interface, not limited to traditional IDEs, to specify features. AI can automatically handle specific tasks like fulfilling software tests, generating test code, reorganizing code for better maintainability, creating new user interface features, and proposing and testing solutions to errors.
  • Responsibility: Coders intervene in exceptional cases or when errors arise that the AI cannot resolve.

Level 4: Fully Automated Coding

  • What Happens: The developer’s role shifts more towards a Product Owner’s. AI can code features based on detailed specifications and autonomously handle errors—making adjustments, testing, and waiting for developers to review and commit changes.
  • Responsibility: The AI provider assumes a significant portion of the responsibility, especially in maintaining the integrity and functionality of the code.

Level 5: Autonomous Coding

  • What Happens: AI handles everything from coding new features based on persistent specifications to upgrading dependencies and fixing errors. It manages the full lifecycle of the code, including deployment.
  • Responsibility: AI becomes largely self-sufficient, significantly reducing the need for human intervention.
Progress toward these levels raises intriguing questions about the future role of human programmers. Will the specifications themselves not be in traditional code? Possibly. They may be in a more human-understandable form that can be translated directly into machine code, with the compiler doing most of the verifying of the machine code. Unlike human language, which can be ambiguous and harder for compilers to validate, this system promises greater precision and efficiency.
As we look to a future where coding is increasingly automated, it’s fascinating to consider how these changes will redefine the landscape of software development. It’s not just about the technology; it’s about how we adapt to these tools to ensure that they enhance our capabilities without displacing the creative and critical elements that define good software development. What do you think? Are we heading toward a world where coders are more supervisors and reviewers than active coders? The conversation is just beginning, and your insights are more valuable than ever!
Leave a Comment

My First Year as a part-time PhD Student

… A Journey into Multimedia Information Retrieval and the Metaverse Hello everyone! I can’t believe it’s already been a year since I embarked on my PhD journey. Time truly flies when you’re engrossed in research, and what a year it’s been! Today, I want to share with you some of the highlights, challenges, and learnings from my first year as a PhD student, focusing on my research project in Multimedia Information Retrieval (MMIR) and its intersection with the Metaverse. The Research Project: MMIR Meets the Metaverse When I started my PhD, I was fascinated by the untapped potential of Multimedia Information Retrieval. MMIR is all about searching and retrieving multimedia data like images, videos, and audio. But I wanted to take it a step further. I was intrigued by the burgeoning Metaverse—a collective virtual shared space created by the convergence of virtually enhanced physical reality and interactive digital spaces. The…

Leave a Comment

Integration of Metaverse and Multimedia Information Retrieval

Diving into the vibrant intersection of the Metaverse and Multimedia Information Retrieval (MMIR), we uncover a fascinating journey that’s shaping the future of Metaverse integration with MMIR. Imagine stepping into a universe where the boundaries between physical and digital realities blur, creating an immersive world teeming with multimedia content. This is the Metaverse, a collective virtual space, built on the pillars of augmented and virtual reality technologies.

At the heart of integrating these worlds lies the challenge of efficiently indexing, retrieving, and making sense of a deluge of multimedia content—ranging from images, videos, to 3D models and beyond. Enter the realm of Multimedia Information Retrieval (MMIR), a sophisticated field dedicated to the art and science of finding and organizing multimedia data.

The research explored here, as my Ph.D. project, ventures into this nascent domain, proposing innovative frameworks for bridging the Metaverse with MMIR. Their work unveils two primary narratives: one, how we can leverage MMIR to navigate the vast expanses of the Metaverse, and two, how the Metaverse itself can generate new forms of multimedia for MMIR to organize and retrieve.

In the first scenario, imagine you’re an educator in the Metaverse, looking to build an interactive, virtual classroom. Through the integration of MMIR, you can seamlessly pull educational content—be it historical artifacts in 3D, immersive documentaries, or interactive simulations—right into your virtual space, enriching the learning experience like never before.

The second scenario flips the perspective, showcasing the Metaverse as a prolific generator of multimedia content. From virtual tours and events to user-generated content and beyond, every action and interaction within the Metaverse creates data ripe for MMIR’s picking. This opens up a new frontier for content creators and researchers alike, offering fresh avenues for creativity, analytics, and even virtual heritage preservation.

Navigating these possibilities, the research present sophisticated models and architectures, such as the Generic MMIR Integration Architecture for Metaverse Playout (GMIA4MP) and the Process Framework for Metaverse Recordings (PFMR). These frameworks lay the groundwork for seamless interaction between the Metaverse and MMIR systems, ensuring content is not only accessible but meaningful and contextual.

To bring these concepts to life, let’s visualize a diagram illustrating the flow from multimedia creation in the Metaverse, through its processing by MMIR systems, to its ultimate retrieval and utilization by end-users. This visualization underscores the cyclical nature of creation and discovery in this integrated ecosystem.

In essence, this research lights the path toward a future where the Metaverse and MMIR coalesce, creating a symbiotic relationship that enhances how we create, discover, and interact with multimedia content. It’s a journey not just of technological innovation, but of reimagining the very fabric of our digital experiences.

Let’s create an image to encapsulate this vibrant future: Picture a vast, sprawling virtual landscape, brimming with diverse multimedia content—3D models, videos, images, and interactive elements. Within this digital realm, avatars of researchers, educators, and creators move and interact, bringing to life a dynamic ecosystem where the exchange of multimedia content is fluid, intuitive, and boundlessly creative. This visualization, rooted in the essence of the research, will capture the imagination, inviting readers to envision the endless possibilities at the intersection of the Metaverse and MMIR.

Leave a Comment

Neue Horizonte im E-Commerce: Wie KI die Spielregeln verändert


KI ist im E-Commerce ein alter Hut. Recommendations, Prognosen, Kundensegmentierung – die Use Cases gibt es schon ewig. Die neuen AI-Technologien sind dennoch ein Game-Changer und verändern den Digital Commerce, da bin ich sicher. Es gibt aber Unternehmen, die sind besser vorbereitet als andere und so wird sich schnell zeigen, wer die Möglichkeiten als Vorteil einsetzen kann – und wer nicht.

1 Comment

256 Metaverse Records Dataset

The dataset was created to explore the use of meatverse virtual worlds and evlauate performance of feature exraction methods on Metaverse Recordings.

I’m thrilled to announce the availability of the 256-MetaverseRecords Dataset, a dataset for experiments with machine learning technology for metaverse recordings. This dataset represents a significant step forward in the exploration of the integration of virtual worlds in Multimedia Information Retrieval.

The dataset was created to explore the use of meatverse virtual worlds and evlauate performance of feature exraction methods on Metaverse Recordings. The dataset contains 256 video records of user sessions in virtual worlds, mostly based on screen recordings.

Leave a Comment

Update KW 50/23

Mal wieder ein kleines Update zu allem möglichen…


  • Diss Progress: Es geht voran, nach einer Phase mit mehr organisatorischen Themen geht es auch wieder mit der Forschung voran. Grundlagen und Rechercheergebnisse sind vorhanden und müssen zu Papier gebracht werden. Mein nächstes Conference Paper ist auch in trockenen Tüchern. Februar, Bali, aber nicht vor Ort.
  • Time Management: Ein Thema, dass mich schon seit einiger Zeit beschäftigt. Sich eine Übersicht zu verschaffen, wo man wie viel seiner Zeit investiert, ist super wichtig. Ich habe dazu mal eine Methode von Aivars Meijers aus seinem YT Video ausprobiert. Empfehlung! Recht einfach und ohne viel Details.
  • Side Hustle: Mehrere Einkommensströme zu schaffen bzw. zu erhalten war ein Ziel für dieses Jahr. Dabei konnte ich meine Dozententätigkeit mit meiner Promotion verbinden, was zumindest den bisherigen thematischen Spagat eliminiert. Im vergangenen Jahr habe ich zu Veranstaltungen zu IoT und Cloud Computing geleitet. Thematisch spannend, aber leider relativ weit weg von der Diss und damit eine zusätzliche Belastung. Wie auch immer, ich möchte aber gerne auch passive Einkommen aufbauen. Dazu habe ich mir ein paar Gedanken gemacht und Experimente aufgesetzt.
  • Threads: Passend dazu gibt es jetzt auch Threads in der EU. Ich nutze dieses neue Netzwerk für eines meiner Experimente. Skill2Lead.

Leadership Insights

  • 1on1: Was diese Woche mal wieder eine wichtige Erkenntnis für mich war: 1on1-Termine mit dem Team sind ein wichtiges Mittel, um im vertrauten Rahmen die Stimmung zu erfassen und gemeinsam Themen wie Fortbildungen, Mitarbeit, oder Verhalten zu besprechen und zu vereinbaren.

World of AI

  • GenAI ist weiterhin ein krasses Tool. Aber gefühlt wird es immer schwerer, aus ChatGPT ein vernünftiges Ergebnis zu bekommen. Ich verwende dazu gerade mit Vorliebe den Kritik-Hack: nach dem Ergebnis diesen Prompt verwenden “Please critique the above response. Then based on the critique, output the full improve response.”

Have a good next week!

Leave a Comment

Award Winning Graphical Abstract and Paper Presentation at IEEE MetroXRAINE 2023

Early in October, I contributed to the IEEE MetroXRAINE 2023 in Milano Italy. With my work, I presented the approach for my PhD research. The paper is titled “Towards the Integration of Metaverse and Multimedia Information Retrieval”. In a nutshell, integrating the metaverse with Multimedia Information Retrieval (MMIR) can be grouped into at least two cases: metaverse used MMIR and MMIR processes metaverse produced multimedia. However, my research concentrates on the integration of metaverse-produced content in MMIR. But more on this in another post. In the submission, I sent a graphical abstract, and hey, it was awarded!


IEEE MetroXRAINE is an interdisciplinary conference on the fields of metrology, Extended Reality, Artificial Intelligence, and Neural Engineering. I tried an EEG-based brain interface for a game. It is an awesome experience, and I’m excited to see more of this technology in the future. But for sure, I need to concentrate more on my research!

Leave a Comment

Big Data Week & AWS User Group Bonn

In the past weeks, I experienced the speaker’s life. I was visiting the IBC in Amsterdam for a few days, presenting the company I’m working for. Early in October, I presented a talk about AI use cases in the Media Landscape at the Big Data Week in Bucharest, Romania. And last week, I gave a similar talk at home, for the AWS User Group in Bonn.

While talking about your thoughts, models and experiences is cool, the bigger stages still make me nervous—however, practice and experience (aka training) help a lot.

AI is a hot topic, not only generative AI, but more and more of the business in media and elsewhere is driven by data. Customer experiences are customized by AI, manual effort vanishes through AI, and even content is created by AI. In Multimedia research, we say “Multimedia is everywhere”, but I think we can state “AI is everywhere”.

Leave a Comment