Category Archives: work in progress

Corporate Genres…

Next week, I’m going to teach a seminar on English on the web at Potsdam University. I was invited by Prof. Dr. Barth-Weingarten because I had taught a seminar on “English in the New Media” there in 2014. The current seminar’s title is “English@Work” and it focusses on the use of English in professional settings. So, even though I’ve worked on web-based English a bit already, this seminar session will be quite a challenge – it’s “undiscover’d country” for me.

I’ve been working hard over the last few weeks to get ideas and a feasible plan for the seminar. Luckily enough, Jana Pflaeging allowed me to pick up on the structure we used for our seminar at Zagreb University, where we compared the two genres “ListSite” and “Personal Weblog” in group work. My idea for Potsdam now is to do the same with corporate websites and corporate blogs (mainly drawing on Poppi 2012 for the former and Puschmann 2010 for the latter).

I think I’ll pursue the question of how the challenge of creating a favourable corporate image for so many different recipients on the web is tackled on corporate websites and corporate blogs. I’ll show that corporate blogs address the need of companies to present themselves as interactive and accessible and, thanks to Puschmann’s previous work, we can also deal with a blog that only at second glance turns out to be a corporate one (i.e. pretends to be something else) – which will be highly interesting 🙂 We will, therefore, compare some language features, participation frameworks, topics and functions – each aspect to be worked on by one group of students. Still, I am very excited and hope that the session turns out to be a good one…

In two weeks’ time, I’ll use the food for thought of this seminar session in a talk on corporate genres at Halle University – so I’m really looking forward to the new input I’ll get from the Potsdam students! 🙂


Book Published :-)

Last week, I received a parcel from the Peter Lang Verlag. Unfortunately, I was too excited to actually create an “unboxing video” (as described by Klaus Kerschensteiner in the first issue of 10plus1: Living Linguistics)… It contained the monograph The Personal Weblog: A Linguistic History that I had worked on whenever I had time in 2015.

It grew into more than a mere translation of my PhD thesis – effectively, I wrote the book anew. And I enjoyed it, as I had the feeling that I could write more freely after the content and the ideas had had some time to settle. The result is, at least I hope so, a readable monograph that is much shorter than my PhD thesis and that also contains a number of new ideas that had not yet been developed at the time I wrote the PhD (e.g. actually mapping the prototypical distribution of features in diagrams that are based on statistics and, in fact, very much resemble Lemke’s (1999) theoretical sketches).

The book is now out for criticism and discussion – and I’m looking forward to both 🙂

EDIT: The book has been reviewed here 🙂


I am very thankful to so many people who have accompanied me on the way to this book. Therefore, I’d like to reproduce the acknowledgements here:

Peter Lang acknowledgements

Nightbus to Zagreb (with Jana Pflaeging)

In a few days, I’ll teach a seminar session on viral and non-viral genres with my colleague and friend Jana Pflaeging at Zagreb University. We’ll compare the genres “ListSite” (Jana published on it in 10plus1) and “Personal Weblog” on several layers. I’m very much looking forward to it – even though it’ll be quite stressful. We’ll take the nightbus from Munich on Monday, teach the seminar on Tuesday morning, and then return at 11pm to Munich by nightbus 🙂

Current Book Project “The Personal Weblog: A Linguistic History” and Peter Lang Nachwuchspreis

After I finished my PhD thesis “Textsorten im Internet zwischen Wandel und Konstanz: Eine diachrone Untersuchung der Textsorte Personal Weblog” in June 2014, I immediately published it as an open-access version. For various reasons, I wrote my PhD in German. One of my first thoughts after publishing it was to turn it into a “proper” book – in English, this time (by “proper book” I mean, for instance, reducing the length from 450 to roughly 200 pages and cutting away typical dissertation rhetoric, orienting not towards examiners but towards an interested semi-expert audience). The working title is “The Personal Weblog: A Linguistic History”. So far, I have finished a chapter on genre theory (including genre change) and one of the two concluding chapters.

An additional motivation for continuing with this project comes from Peter Lang Verlag: My PhD and the planned English book based on it were awarded the Peter Lang Nachwuchspreis, which includes the coverage of all publication costs for a print and an ebook edition. The award came as quite a surprise, but I am really grateful for the opportunity and the additional motivation it offers for my book project!

Feedback on Chapter 8: Textual Function

Last week, my mentor gave me detailed feedback on chapter 8 of my thesis. All in all, the feedback was quite positive. The most important points to work on include:

  • The chapter could do with some restructuring. Until the feedback session, I did not realise that some aspects that are discussed in point 8.3.1. of the chapter (functions as ethnocategories and their prototypicity) could be tightly linked to the first point, where some theoretical considerations on textual functions and a methodology for their analysis are offered together with a review of research on Weblogs’ functions (see Table of Contents May 2013).
  • In general, we started thinking about restructuring the thesis in order to increase readability. Chapter 2, which was actually intended to develop a genre model in detail, including discussions of the individual layers, might just serve as a rough sketch of the genre concept and its socio-cognitive components as well as the layers; the detailed discussions should be postponed until the first part of each analytical chapter. I hope that works out… I also started thinking about how I could shorten the thesis and be more concise in the end.
  • The ethnocategories should be discussed in more detail especially concerning the question whether they are really functionally determined or rather bundles of features containing a whole lot of structure as well. I tend to assume the latter, but should strengthen this aspect, especially because my chapter contains a whole lot of structural analysis as the basis of arguing for functional ascriptions.
  • Maybe some parts of the structural analysis will become redundant when chapter 7 is developed; then I could shorten the analyses a little bit.
  • My “entertainment” function is analysed as an appellative function urging readers to view a text as entertaining. That is, actually, meta-communicative and therefore situated on another layer than all the other functions… I should therefore treat the entertainment aspect in a different sub-chapter (and not next to advertisements, for instance).
  • Other functions, such as teasing or boasting, could also be addressed in this rather small sub-chapter.

All in all, I am starting to realise that I am reaching a point where I have to start thinking about the whole of the thesis again… so I am already entering some phase of transition into the last stage of the thesis. I hope I can manage the amount of work still before me till the end of the year…

Chapter 8: Multimodal Structure

After handing in chapter 7 (Textual Function) and while waiting for feedback on those nearly 80 pages, I started working on the chapter on multimodal structure. This chapter is basically a core linguistic one and should contain analyses on the following aspects:

  • Macro Structure:
    • layout of blog pages (header, sidebars, body etc.)
    • blog pages as part of a network of pages (about, pictures, homepage etc etc)
  • Meso Structure (is that a proper term?):
    • key elements of blog postings (meta links, tags, categories)
    • key elements of sidebars (meta links, blogrolls etc etc.)
    • thematic structure of blog postings (?)
  • Micro Structure:
    • language and image
    • register / style: key words, frequency counts, sentence and word length…
    • hyperlinks and their uses
    • topics, subtopics, topical coherence

As always, I did have a rough idea about what the chapter should deal with, but I did not know how to gather the necessary data. I did quite extensive research on corpus software, comparing the abilities of particular programs, always asking myself whether I could need what was offered.

I came across the program TreeTagger by Helmut Schmid (described in detail in Schmid 1994). This software can be used on .txt files and creates a vertical .txt file (one token per line) with a POS tag added to each token the tagger knows. Installing the program on Windows is not easy for dummies as it was actually designed to run on Linux and still needs the command shell. There is, however, also a graphical interface, which I tried out (of course) and which works quite well.
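To illustrate what such a vertical file looks like in practice, here is a minimal Python sketch for reading it back in. It assumes tab-separated columns in the order token, POS tag and (optionally) lemma; the function name `parse_vertical` and the sample lines are invented for illustration and are not taken from the corpus.

```python
def parse_vertical(text):
    """Turn a vertical (one token per line) tagger output into tuples.

    Assumption: columns are tab-separated in the order
    token, POS tag, lemma. Missing columns become None.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        parts = line.split("\t")
        token = parts[0]
        tag = parts[1] if len(parts) > 1 else None
        lemma = parts[2] if len(parts) > 2 else None
        rows.append((token, tag, lemma))
    return rows


# Invented sample in the assumed three-column layout
sample = "The\tDT\tthe\nweblogs\tNNS\tweblog\nchanged\tVVD\tchange\n.\tSENT\t."

for token, tag, lemma in parse_vertical(sample):
    print(token, tag, lemma)
```

Keeping the rows as plain tuples makes it easy to pass them on to a frequency count or to export them to a statistics package later.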

TreeTagger serves as a POS tagger only. M.Eik Michalke provides a software package – koRpus – which works within the R framework. The koRpus package can tag .txt files using TreeTagger and afterwards do some frequency analyses on the text in question. As it is written by a psychologist, its focus lies on readability measures. As my knowledge of R is quite limited and I became disappointed (after everything had looked really promising for a while) with the measures available (and especially with the way the data generated by the analyses is stored and made available for further use – I was not able to really figure that out, not even using the graphical R interface RKWard), I decided not to use koRpus and to look for other software.

And I found WordSmith, a software package that offers the following (unfortunately, not on an open-source basis like the R packages and therefore not for free…):

  • word lists, frequency analyses and measures such as sentence- and word length
  • key words in texts or groups of texts based on the word lists of single texts, key words can be compared with established corpora such as the BNC
  • concordances (even though I probably will not need those)

I was especially thrilled by the key word feature, as this makes it possible to identify key topics when the 10 to 20 most frequent nouns in a text (or all texts of a period) are understood as indicators of the topics mostly dealt with. An example: I did a key word analysis on two texts of period one and found IT words among the most frequent nouns. This was what I expected as the first weblog authors were mainly IT experts and their weblogs dealt (among other, more personal topics as in EatonWeb, for instance) with IT stuff, software, new links on the web, Apple vs. Microsoft and so on and so forth. I now hope to use this key word tool for a broader analysis, aiming at extrapolating topical shifts across the periods.
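The idea of reading the most frequent nouns as topic indicators can be sketched in a few lines of Python. This is only an illustration of the counting step, not WordSmith’s own procedure; it assumes that the text has already been POS-tagged and that noun tags start with “NN”. The function name `top_nouns` and the sample data are made up.

```python
from collections import Counter


def top_nouns(tagged_tokens, n=20, noun_prefixes=("NN",)):
    """Return the n most frequent nouns as (word, count) pairs.

    tagged_tokens: iterable of (token, pos_tag) pairs.
    Assumption: noun tags begin with one of noun_prefixes.
    Word forms are lower-cased but not lemmatised.
    """
    counts = Counter(
        token.lower()
        for token, tag in tagged_tokens
        if tag and tag.startswith(noun_prefixes)
    )
    return counts.most_common(n)


# Invented sample of tagged tokens
tagged = [
    ("software", "NN"),
    ("Software", "NN"),
    ("links", "NNS"),
    ("is", "VBZ"),
    ("software", "NN"),
]

print(top_nouns(tagged, n=2))
```

Running the same count per period and comparing the resulting lists would be one simple way of spotting topical shifts.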

So, currently I am working myself through all corpus texts again (330), doing the following steps (as always, I use SPSS for my statistics):

  1. I count the hyperlinks used in the entries. I differentiate between external links (the URL points to another domain), internal links (the URL remains within the same domain; links to categories, for example, are internal as well), meta links (Permalinks, Trackbacks and Comment links, mostly at the end of postings; categories do not belong here and are counted as internal links because some period I weblogs already offer internal category links but no other meta links, and because I want to get neat data for the categories) and other links (mailto:, download etc.).
  2. I count other meso-structural features such as BlogRolls, guest books and so on. Maybe there are some trends that show after some counting…
  3. I determine a layout-type – Schlobinski & Siever (2005) suggested some and I extended their typology.
  4. I code the text in MAXQDA for special features like emoticons, rebus forms, oral features, graphostyle…
  5. I generate a pdf-file from the website which is imported to MAXQDA as well. This pdf-file is used for coding the language-image interplay and image types. Currently, I am doing some rough coding, intending to get more fine grained later on.
  6. I generate a .txt file with the postings of the weblog. This .txt file will later be used in WordSmith.
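The four link categories from step 1 can be sketched as a small Python function. This is a rough approximation of a manual coding decision, not the procedure actually used: the hint words for meta links, the list of download suffixes and the function name `classify_link` are assumptions made up for this example.

```python
from urllib.parse import urljoin, urlparse

# Assumed hint words that mark meta links in the link text
META_HINTS = ("permalink", "trackback", "comment")
# Assumed file suffixes that count as downloads
DOWNLOAD_SUFFIXES = (".zip", ".pdf", ".mp3")


def classify_link(href, page_url, link_text=""):
    """Classify a link as 'other', 'meta', 'internal' or 'external'."""
    absolute = urljoin(page_url, href)  # resolve relative URLs
    parsed = urlparse(absolute)
    if parsed.scheme == "mailto" or parsed.path.lower().endswith(DOWNLOAD_SUFFIXES):
        return "other"
    if any(hint in link_text.lower() for hint in META_HINTS):
        return "meta"
    # Same host counts as internal; note that 'www.example.org' and
    # 'example.org' would be treated as different hosts here.
    if parsed.netloc.lower() == urlparse(page_url).netloc.lower():
        return "internal"
    return "external"


page = "http://example.org/blog/"
print(classify_link("/category/tech", page, "Tech"))
print(classify_link("http://other.net/page", page, "a nice site"))
print(classify_link("/2003/05/post.html#comments", page, "Comments (3)"))
print(classify_link("mailto:me@example.org", page, "mail me"))
```

Category links fall into the internal group simply because they stay on the same host and carry no meta hint in their link text, which matches the decision described in step 1.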

This procedure takes a while. As it is quite exhausting as well, I can only analyse around 20 texts per day. So that means around 6 weeks of work until I can move on to the WordSmith analyses and the language-image interplay (I’m really dreading that…).

Corpus Update

As I have pointed out in my first post, one comment about the diachronic corpus of Personal Weblogs my thesis is based on concerned the number of texts, especially in the later periods. (An outline of the corpus structure can be found in the talks “Anything goes – everything done?” and “Stability, Diversity, and Change. The Textual Functions of Personal Weblogs”.) People argued that a low number of texts was fine for period one, as there were only a few weblogs around in those days. However, higher numbers of texts were expected for later periods as access grew easier with more recent collection dates.

I have been thinking about these comments ever since, trying to find arguments for not extending the corpus. What I found, however, were quite weak excuses. Even more, I started wondering how I could justify a particular number of texts for a period in question at all. I came up with the following line of reasoning:

  • I work with both qualitative and quantitative methods, even though my general focus lies on the qualitative end of the continuum. Text numbers, therefore, have to be justified both from a qualitative and a quantitative point of view.
  • The qualitative framework of my thesis is heavily inspired by Grounded Theory (e.g. following Glaser & Holton 2004). In Grounded Theory, there is a process called “Theoretical Sampling” combining data collection, coding and analysis. The basic idea is that data collection is guided by the emerging theory and strives for theoretical saturation. In other words: If nothing new is found, no conflicting cases, no cases challenging the categories established so far, the analyst has reached some point close enough to theoretical saturation to stop collecting samples. (footnote: He might as well have turned blind to new phenomena by excessive preceding analysis. Anyway, further collection of samples would not help the research project in that case, either.) So that’s exactly my qualitative part of the argumentation: Collecting text samples until nothing new or challenging is discovered. This point had already almost been reached after collecting and analysing 80 to 90 texts for the periods II.A to II.C, but it was good to put my categories to the test by collecting more texts and assimilating them into my theory.
  • From a quantitative point of view, a researcher has to make some kind of informed guess on how many cases will probably be enough to make some statistically sound statements. One formula suggested by Raithel (2008: 62) uses the number of variables to be joined in one analytical step (e.g. a correlation study of two variables) and associated features (e.g. two features for the variable “gender”); this value is multiplied by 10: n >= 10 * K^V, with K being the number of features per variable and V the number of variables. As I try to trace the change within several variables which are investigated apart from each other, my analytical steps quite often only contain one variable with a particular number of features. The variable with the highest number of features at present is the textual function with about ten distinct features (e.g. Update, Filter, Sharing Experience as outlined in my last post). Consequently, about 100 texts per period are roughly enough according to this formula. This is quite a tight budget; if I want to correlate the variable “textual function” with the variable “gender of author” I have to point out that the results give some hint at a possible statistical connection but have to be taken with a pinch of salt.
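The arithmetic behind the estimate n >= 10 * K^V can be checked with a short Python sketch. The function name `min_cases` is made up for this illustration; the formula itself is the one quoted from Raithel (2008: 62) above.

```python
def min_cases(features_per_variable, variables):
    """Minimum number of cases according to n >= 10 * K^V.

    features_per_variable: K, the number of features per variable
    variables: V, the number of variables joined in one analytical step
    """
    return 10 * features_per_variable ** variables


# One variable (textual function) with about ten features
print(min_cases(10, 1))
```

For K = 10 and V = 1 this gives 100, which is the figure used above. Because V is an exponent, the required number of cases rises steeply as soon as two variables are looked at together, which is why the budget is so tight.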

I think that both arguments taken together form a fairly stable basis for the justification of the number of cases. I guess 100 texts in the periods II.A, II.B and II.C are also a good compromise between striving for ever higher case numbers and the feasibility of qualitatively and thoroughly analysing, say, 500 texts in each period.

So, after the extension phase that took me a bit more than one week of searching for texts, coding, basically repeating all analytical steps I had done before and updating the numbers in my thesis, the corpus looks like this now (snapshot from my screen, sorry for the quality):


Thoughts on chapter 7 again: primarily informative functional patterns

I’ve been working on chapter 7 of my thesis (textual functions of Personal Weblogs) for the last week and a half. Work is going well, even though I’m a bit worried about my time management and the amount of space this chapter will probably occupy in the end.

So far, I have finished a research review, the methodology and some functions. As I have pointed out in this post, I basically present ethnocategories such as update, filter, or sharing experience and the linguistic descriptions of postings that belong in each category. Thus, I hope to blend the benefits of ethnographic studies (such as Nardi et al. 2004a and 2004b, Reed 2005, Brake 2007, Baumer et al. 2008) and detailed linguistic analysis. The advantage of combining two methodologies (apart from gaining a clearer insight into what actually characterises the different functions, i.e., how they are actually realised in Personal Weblogs) lies, in my opinion, in the opportunity of generating functional categories on the basis of linguistic analysis which are not mentioned explicitly / only vaguely by bloggers or by ethnographic studies, respectively. In order to be able to present categories without ethnographic counterpart (and for reasons of legibility), I have decided on presenting the functions arranged in the following groups (inspired by Brinker’s works):

  1. primarily informative functions
  2. primarily appellative functions
  3. primarily contact-oriented functions
  4. functions focussed on benefits of the writing process

I am currently working on informative functions. The chapter is structured like this:

  1. filter
  2. update
  3. sharing experience
  4. further primarily informative functions

Subchapter 4 is what I am currently focussed on. It is not as easy and clear-cut as the first three subchapters. I think it should include the following functional patterns:

  • informing about external topics (cf. Puschmann 2009, 2010)
  • voicing opinions
  • review
  • giving advice

Today, I have covered the first point. I have discovered that it actually includes two patterns: First of all, postings that mimic a newspaper-like style and seem to belong to the category “journalistic blogging”. Secondly, postings aimed at some kind of knowledge transfer from experts to interested laypeople. I am not sure whether I should separate these patterns, but I guess I should, as the postings of these groups look quite different and the functions “providing the latest news” vs. “transferring expert knowledge in an understandable way” are distinct enough to be treated as different patterns.

While writing this, I realised that the function “knowledge transfer” is quite close to “giving advice” as the latter is some kind of knowledge transfer with the special twist of providing instructions. “Giving advice” also exhibits certain overlaps with “sharing experience” as the advice given in Personal Weblogs is often nothing else than knowledge gained by experience. I think I should explicitly state these overlaps and use them for smoothly guiding the reader through the subchapter…

Tomorrow, I will make sure to split the first category of the subchapter (informing about external topics) into the two subcategories just mentioned. From there, I will continue with “giving advice” in order to provide a smooth transition. What a plan!

Outlook: I am still thinking about the quantification part of the chapter. How does a correlation study work? My idea is to create variables for each function in SPSS to be able to state for each weblog whether the specific function could be detected. I would like to use these variables for some sort of correlation to answer the question of which functions do usually co-occur in weblogs and whether functional clusters can be detected. I will give this some more thought and come back to it in the next post.
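One way such a co-occurrence check could look is sketched below in Python rather than SPSS. It codes each function as a 0/1 variable per weblog and computes the phi coefficient (the correlation measure for two binary variables) for every pair of functions. The data, the variable names and the choice of phi are assumptions for illustration only; they are not results from the corpus.

```python
from itertools import combinations
from math import sqrt


def phi(xs, ys):
    """Phi coefficient for two equally long lists of 0/1 values.

    Returns None if one variable is constant (phi is undefined then).
    """
    pairs = list(zip(xs, ys))
    n11 = sum(1 for x, y in pairs if x and y)
    n10 = sum(1 for x, y in pairs if x and not y)
    n01 = sum(1 for x, y in pairs if not x and y)
    n00 = sum(1 for x, y in pairs if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None
    return (n11 * n00 - n10 * n01) / denom


# Invented coding: one entry per weblog, 1 = function detected
functions = {
    "update": [1, 1, 0, 0],
    "sharing_experience": [1, 1, 0, 0],
    "filter": [0, 0, 1, 1],
}

for a, b in combinations(functions, 2):
    print(a, b, phi(functions[a], functions[b]))
```

Values close to 1 would point to functions that tend to occur in the same weblogs, values close to -1 to functions that tend to exclude each other. Whether such pairwise values add up to functional clusters would still need a separate step.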

Thoughts about: micro genres of postings

I’m currently working on chapter 7 of my thesis (textual functions of Personal Weblogs). I have identified several functional ethnocategories such as filter, update, sharing experience, review and so on. Additionally, I have employed theoretical codes to capture functional patterns not explicitly termed by the community (e.g. several contact functions; appellative patterns etc.).

Even before the last conference, a thought struck me: Actually, what I’m doing now is a description of posting genres. Each functional pattern can be differentiated from others by structural, functional and contextual features. The one and only layer which remains constant is the form of communication. So we can say that stable patterns / genres of postings have been established within the blogging community. The Personal Weblog as ethnocategorial genre picks from those posting genres and thus establishes a functional set.

Another argument for the status of micro genres is the following: Working on the filter function, I realized that there are several ways of realising the filter function. A neutral, matter-of-fact way (note to self: include PeterMe Perfect from period I in analysis!!!), a more author-centric way and a humorous way which plays with the established patterns. Therefore, we can assume that the posting genres each have a certain scope of variability. Variation is a central characteristic of genres (cf. Brock 2009, Giltrow & Stein 2009, Lemke 1999, Santini et al. 2011, Swales 1990 etc.). So if we can establish several sub-patterns for the micro genres or at least describe a range of variation, this, too, is a good indicator, in my opinion, for their genre status…

Two conferences in three weeks…

The last three weeks were quite exhausting, exciting and in general a thrilling experience for me as a doctoral student.

In February, I had the opportunity of presenting my diachronic corpus of Personal Weblogs to an audience of media linguists and communication scientists at the conference of the DGPuK section “Mediensprache”. The focus of the talk was my methodology of collecting corpus candidates and selecting those that were added to the corpus. I also presented some ideas about the use of images in Personal Weblogs. The slides and the manuscript of the talk can be found on the “publication”-page.

The feedback was quite positive. Michael Klemm suggested conducting interviews, especially concerning the question of media selection – the choice between a weblog, Facebook, Twitter and other forms of communication. I am thinking about this suggestion; probably I won’t have the time and space to include that in my doctoral thesis. I guess I should focus on the material I have gained from analysing the metablogging in my corpus texts. However, it might be a good idea to mention the idea of conducting interviews as a matter for further research in my conclusion section.

Another comment concerned the size of my corpus, in particular the 80 texts in period II.C. I should be aware that people will always ask why there is a particular number of texts, why not more, why not less. I am thinking of extending the corpus to 100 texts per period in part II. This entails a lot of work; however, according to my estimation formula (I use Raithel’s (2008: 62) formula n >= 10 * K^V with K being the number of features per variable and V being the number of variables) 100 texts are a safe number to work with as all my variables do not have more than 8-ish different features and my study does not need to look at more than 2 variables simultaneously. Be that as it may, I find this insisting on numbers a bit frustrating. I mean, I DO have 80 texts per period II.B and II.C and even 93 for period II.A. And I DO work with a sheer flood of examples from those texts – so why is that not enough to describe some patterns and their change(s)?

Another, very interesting suggestion was that of a connection between media development and topics – never thought about the fact that fashion blogs came into existence because of the ease of embedding images! Thanks to Christof Barth (Trier University) for that idea!

My second talk was last weekend (14th NLK) and dealt with the textual functions of the Personal Weblogs in DIABLOK. I presented my methodology – a combination of Grounded Theory-style content analysis (Glaser & Holton 2004; Mayring 2010) and linguistic analysis à la Klaus Brinker (1983, 2000, 2010). I basically work with ethnocategories here – so I try to find out what bloggers say they do functionwise and analyse these functional patterns linguistically. I suggested functional patterns called Update, Filter, and Sharing Experience.

My mentor Alexander Brock, who was also present at the conference, was not quite content with the names of the functional patterns, especially regarding the Update-function. I am not sure whether I get him right: His point is that “Update” actually only concerns a special kind of information structure, a ratio of new and old information. In my opinion, “Update” is a functional pattern that the blogging community has termed like that and which can be recognized by structural, contextual and functional features (see my slides for examples).

Our compromise, however (even though it might be the result of a misunderstanding), is quite a useful one: My mentor suggested not presenting all the ethnocategories as separate subchapters but rather grouping them according to their dominating functional component. So there will be subchapters on informational, appellative, and contact functions as well as on production-oriented functions (thinking by writing, releasing emotional tension, creative expression).

Apart from that, I got a highly interesting comment about the DarkNet with its utter anonymity and a possible comparison of my Personal Weblogs with the textual patterns to be found there. Thank you, Marco, for that – I will definitely follow this trace some day!