Essay Award “Bildung Digital” II

This is a quick update to the previous post, in which I announced that I had won the second prize in the Stifterverband’s essay competition “Bildung heute – Bildungsideal einer digitalen Zeit”. The essays (1st, 2nd, 3rd) are now available online as audio readings and as PDF files (here’s mine).

After having the pleasure of listening to Maria reading her winning essay, I can only recommend her text. I have seldom come across a text with such depth, clarity and elegance at the same time. Congratulations!

The Post PhD Blues

I haven’t reblogged anything here yet, but this post was so much to the point that I just couldn’t resist. The post-PhD blues are familiar to me as well.

The Thesis Whisperer

This post is written by Brian Flemming, a mathematician working as a Systems Engineer in Edinburgh. He has recently completed an Engineering Doctorate (EngD) as a mature student at Heriot Watt University, which he found an intensive and enjoyable experience, and which he credits with greatly increasing the effectiveness and authority of his work. He is now appreciating the freedom to continue studying and spend time away on the hills, without the associated "PhD-guilt" of neglecting the books …

When Brian sent me this post I could instantly relate. In fact, this blog is the outcome of my own PhD blues where I needed something meaningful, creative and interesting back in my life. I know many people who have finished and express similar sentiments. Here are Brian's thoughts.

One of the posts that caught my eye recently commented on the career prospects for the newly-qualified PhD, especially outside academia.

Essay-Award “Bildung Digital”

At the end of 2014, I participated in a competition which was launched by the Stifterverband Deutsche Wissenschaft, the Hochschulforum Digitalisierung and the initiative Was bildet Ihr uns ein?. Participants were asked to write an essay on the Bildungsideal in the digital age. My text approaches the issue in a creative way, using the setting of a well-known German fairy-tale for the compilation of a whole wishlist of aspects this educational ideal should comprise. The essay was awarded the second prize, which means that I’m invited to present it in Berlin on 2nd February this year – how exciting! The first 12 essays of roughly 90 submissions will be published on- and offline. I will provide a link here as soon as possible.

Current Book Project “The Personal Weblog: A Linguistic History” and Peter Lang Nachwuchspreis

After I finished my PhD thesis “Textsorten im Internet zwischen Wandel und Konstanz: Eine diachrone Untersuchung der Textsorte Personal Weblog” in June 2014, I immediately published it as an open-access version. For various reasons, I wrote my PhD in German. One of my first thoughts after publishing it was to turn it into a “proper” book, in English this time. By “proper book” I mean, for instance, reducing the length from 450 to roughly 200 pages, cutting typical dissertation rhetoric, and writing not for examiners but for an interested semi-expert audience. The working title is “The Personal Weblog: A Linguistic History”. So far, I have finished a chapter on genre theory (including genre change) and one of the two concluding chapters.

An additional motivation for continuing with this project comes from Peter Lang Verlag: My PhD and the planned English book based on it were awarded the Peter Lang Nachwuchspreis, which covers all publication costs for a print and an ebook edition. The award came as quite a surprise, but I am really grateful for the opportunity and the additional motivation it offers for my book project!

eJournal Launched

The eJournal I mentioned in the previous post – 10plus1: Living Linguistics – has finally been launched! In December 2014, my colleague Jana Pflaeging and I took the final step and made the journal public. The response so far has been overwhelmingly positive. Currently, we are working on 10plus1’s editorial board, which literally grows every day. As of today, the following scholars have agreed to join the 10plus1 team:

  • Gerd ANTOS, Universität Halle-Wittenberg
  • Matthias BALLOD, Universität Halle-Wittenberg
  • Manuel BURGHARDT, Universität Regensburg
  • Bettina M. BOCK, Universität Halle-Wittenberg / Universität Leipzig
  • Alexander BROCK, Universität Halle-Wittenberg
  • Christine DOMKE, Technische Universität Chemnitz
  • Ulla FIX, Universität Leipzig
  • Martin LUGINBÜHL, Université de Neuchâtel
  • Christian PENTZOLD, Technische Universität Chemnitz
  • Jana PFLAEGING, Universität Halle-Wittenberg / Universität Salzburg
  • Jan Oliver RÜDIGER, Universität Kassel
  • Peter SCHILDHAUER, Universität Bielefeld / Universität Halle-Wittenberg
  • Marion SCHULTE, Universität Bielefeld
  • Martin SIEFKES, Universität Bremen
  • Hartmut STÖCKL, Universität Salzburg
  • Janina WILDFEUER, Universität Bremen

We are looking forward to the contributions to the first issue!

The eJournal-project

My colleague Jana Pflaeging and I have started thinking about launching an eJournal. What an introductory sentence… Now on to the content:

Our idea is a cross-cutting journal addressing a particular linguistic problem or subdiscipline with each issue. We hope to set ourselves apart from the many existing journals with a narrow topical focus. By cross-cutting I also mean: Each issue is meant to display a cross-section of the scholarly community – from tenured professors to post-docs, doctoral students and students working on excellent papers. We hope to offer a publication platform especially to younger researchers and to enable them to publish work in progress, preliminary results and fresh, innovative ideas in a peer-reviewed journal. The journal will be their megaphone, inviting responses and critical comments and helping them to extend their network.

Jana has found an excellent article by Dr. Gerry Coulter, editor of the International Journal of Baudrillard Studies. His article reports his experiences as an editor – from the launch of the ejournal until the present day. As crucial factors of success, he identifies (Coulter 2010: 2):

  1. a good idea
  2. a strong organizing force
  3. the ability to gain acceptance among experts in the field
  4. excellent content
  5. a plan for publicizing the launch
  6. appropriate resources
  7. a long-term plan for managing the journal

Provided our organizing force is strong enough, there are still a couple of things to sort out according to this list:

  1. We have to present our idea to a number of scholars to get feedback (as Coulter did).
  2. Acceptance among experts in the field is a difficult point for us, as we do not address a particular field but rather a special type of author (not only renowned scholars but also “newbies”) and special types of articles (work in progress, dialogues, synopses etc.). The consequence is to look for the potential audience and ask them for feedback – but who could that be? Will established scholars be interested in this kind of content? Will students be?
  3. Resources are the next most pressing problem: Will our university provide us with server space? Will we have to apply for funds, or can we count on free-to-use software, for instance? In other words: Can we afford the project, both in terms of time and money?
  4. Making the journal known and accepted. One step towards this is applying for an ISSN. This does not seem to be THAT difficult, as there are national ISSN agencies to which the application can be addressed. Guidelines and the official ISSN website provide easy-to-use step-by-step information.
  5. The future: Laying out a plan not only for the next year but for, say, 10 years.

In sum, more questions are open than answered, and many things remain to be ticked off our list before the journal can be launched – hopefully in March 2015.

Feedback on Chapter 8: Textual Function

Last week, my mentor gave me detailed feedback on chapter 8 of my thesis. All in all, the feedback was quite positive. The most important points to work on include:

  • The chapter could do with some restructuring. Until the feedback session, I did not realise that some aspects discussed in point 8.3.1 of the chapter (functions as ethnocategories and their prototypicity) could be linked closely to the first point, where some theoretical considerations on textual functions and a methodology for their analysis are offered together with a review of research on weblogs’ functions (see Table of Contents May 2013).
  • In general, we started thinking about restructuring the thesis in order to increase readability. Chapter 2, which was actually intended to develop a genre model in detail, including discussions of the individual layers, might just serve as a rough sketch of the genre concept, its socio-cognitive components and the layers; the detailed discussions should be postponed until the first part of each analytical chapter. I hope that works out… I also started thinking about how I could shorten the thesis and be more concise in the end.
  • The ethnocategories should be discussed in more detail especially concerning the question whether they are really functionally determined or rather bundles of features containing a whole lot of structure as well. I tend to assume the latter, but should strengthen this aspect, especially because my chapter contains a whole lot of structural analysis as the basis of arguing for functional ascriptions.
  • Maybe some parts of the structural analysis will become redundant once chapter 7 is developed; then I could shorten the analyses a little bit.
  • My “entertainment” function is analysed as an appellative function urging readers to view a text as entertaining. That is, actually, meta-communicative and therefore situated on a different layer from all the other functions… I should therefore treat the entertainment aspect in a different sub-chapter (and not next to advertisements, for instance).
  • Other functions, such as teasing or boasting, could also be addressed in this rather small sub-chapter.

All in all, I am starting to realise that I am reaching a point where I have to think about the thesis as a whole again… so I am already entering a phase of transition into the last stage of the thesis. I hope I can manage the amount of work still ahead of me by the end of the year…

Chapter 8: Multimodal Structure

After handing in chapter 7 (Textual Function) and while waiting for feedback on those nearly 80 pages, I started working on the chapter on multimodal structure. This chapter is basically a core linguistic one and should contain analyses on the following aspects:

  • Macro Structure:
    • layout of blog pages (header, sidebars, body etc.)
    • blog pages as part of a network of pages (about, pictures, homepage etc.)
  • Meso Structure (is that a proper term?):
    • key elements of blog postings (meta links, tags, categories)
    • key elements of sidebars (meta links, blogrolls etc.)
    • thematic structure of blog postings (?)
  • Micro Structure:
    • language and image
    • register / style: key words, frequency counts, sentence and word length…
    • hyperlinks and their uses
    • topics, subtopics, topical coherence

As always, I had a rough idea about what the chapter should deal with, but I did not know how to gather the necessary data. I did quite extensive research on corpus software, comparing the capabilities of particular programs and always asking myself whether I actually needed what was on offer.

I came across the program TreeTagger by Helmut Schmid (described in detail in Schmid 1994). This software can be used on .txt files and creates a vertical .txt file (one token per line) with a POS tag added to each token the tagger knows. Installing the program on Windows is not easy for beginners, as it was actually designed to run on Linux and still needs the command shell. There is, however, also a graphical interface, which I tried out (of course) and which works quite well.
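To illustrate what such a vertical file looks like once it is read back in, here is a minimal Python sketch. It assumes a tab-separated layout of token, POS tag and an optional lemma column; the sample lines and the names `TaggedToken` and `parse_vertical` are made up for illustration and are not actual tagger output.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    token: str
    pos: str
    lemma: str


def parse_vertical(text: str) -> List[TaggedToken]:
    """Parse 'token<TAB>POS[<TAB>lemma]' lines into records."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # no tag on this line, so nothing to record
        # Assumption: if there is no lemma column, fall back to the token itself.
        lemma = parts[2] if len(parts) > 2 else parts[0]
        tokens.append(TaggedToken(parts[0], parts[1], lemma))
    return tokens


# Hypothetical sample in the vertical format (not real tagger output).
sample = "The\tDT\tthe\nweblog\tNN\tweblog\nlinks\tVBZ\tlink\n"
tagged = parse_vertical(sample)
```

Each record can then be filtered by its POS tag, for instance to keep only the nouns of a posting.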

TreeTagger serves as a POS tagger only. M.Eik Michalke provides a software package, koRpus, which works within the R framework. The koRpus package can tag .txt files using TreeTagger and afterwards run some frequency analyses on the text in question. As it is written by a psychologist, its focus lies on readability measures. My knowledge of R is quite limited, and after everything had looked really promising for a while, I became disappointed with the measures available, and especially with the way the data generated by the analyses is stored and made available for further use (I was not able to really figure that out, not even using the graphical R interface RKWard). I therefore decided not to use koRpus and to look for other software.

And I found WordSmith, a program that offers the following (unfortunately, unlike the R packages, it is not open source and therefore not free…):

  • word lists, frequency analyses and measures such as sentence- and word length
  • key words in texts or groups of texts, based on the word lists of single texts; key words can be compared with established corpora such as the BNC
  • concordances (even though I probably will not need those)

I was especially thrilled by the key word feature, as it makes it possible to identify key topics if the 10 to 20 most frequent nouns in a text (or in all texts of a period) are understood as indicators of the topics mostly dealt with. An example: I did a key word analysis on two texts of period one and found IT words among the most frequent nouns. This was what I expected, as the first weblog authors were mainly IT experts and their weblogs dealt (among other, more personal topics, as in EatonWeb, for instance) with IT stuff, software, new links on the web, Apple vs. Microsoft and so on. I now hope to use this key word tool for a broader analysis, aiming at extrapolating topical shifts across the periods.
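The idea of reading the most frequent nouns as topic indicators can be sketched in a few lines of Python. This is only a raw frequency count over already tagged tokens, not WordSmith's key word procedure. The input triples, the function name `top_nouns` and the assumption that noun tags start with "NN" are illustrative choices of mine.

```python
from collections import Counter


def top_nouns(tagged, n=20, noun_prefix="NN"):
    """Return the n most frequent noun lemmas from (token, pos, lemma) triples."""
    counts = Counter(
        lemma.lower()
        for _token, pos, lemma in tagged
        if pos.startswith(noun_prefix)  # assumption: noun tags begin with "NN"
    )
    return counts.most_common(n)


# Hypothetical toy input; real input would come from a tagged posting.
toy = [
    ("software", "NN", "software"),
    ("links", "NNS", "link"),
    ("Software", "NN", "software"),
    ("is", "VBZ", "be"),
    ("link", "NN", "link"),
    ("browser", "NN", "browser"),
    ("software", "NN", "software"),
]
result = top_nouns(toy, n=2)
```

Counting lemmas rather than word forms keeps "link" and "links" together, which seems preferable when the goal is topics rather than forms.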

So, currently I am working my way through all corpus texts again (330), doing the following steps (as always, I use SPSS for my statistics):

  1. I count the hyperlinks used in the entries. I differentiate between external links (the URL points to another domain), internal links (the URL remains within the same domain; links to categories, for example, are internal as well), meta links (permalinks, trackbacks and comment links, mostly at the end of postings) and other links (mailto:, download etc.). Categories are counted as internal links rather than meta links, because some period I weblogs already offer internal category links but no other meta links, and because I want to get neat data for the categories.
  2. I count other meso-structural features such as BlogRolls, guest books and so on. Maybe there are some trends that show after some counting…
  3. I determine a layout-type – Schlobinski & Siever (2005) suggested some and I extended their typology.
  4. I code the text in MAXQDA for special features like emoticons, rebus forms, oral features, graphostyle…
  5. I generate a PDF file from the website, which is imported into MAXQDA as well. This PDF file is used for coding the language-image interplay and image types. Currently, I am doing some rough coding, intending to get more fine-grained later on.
  6. I generate a .txt file with the postings of the weblog. This .txt file will later be used in WordSmith.
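The link typology from step 1 could be approximated automatically along the following lines. This is a rough Python sketch, not the procedure I actually follow (I count by hand). The marker list for meta links, the example URLs and the function name `classify_link` are hypothetical, and download links are not detected separately here.

```python
from collections import Counter
from urllib.parse import urljoin, urlparse

# Hypothetical markers for meta links; real blogs would need their own list.
META_MARKERS = ("permalink", "trackback", "#comments")


def classify_link(href: str, blog_url: str) -> str:
    """Classify a hyperlink as 'meta', 'internal', 'external' or 'other'."""
    absolute = urljoin(blog_url, href)  # resolve relative links against the blog
    parsed = urlparse(absolute)
    if parsed.scheme not in ("http", "https"):
        return "other"  # e.g. mailto: links
    if any(marker in absolute.lower() for marker in META_MARKERS):
        return "meta"
    if parsed.netloc.lower() == urlparse(blog_url).netloc.lower():
        return "internal"  # same domain, including category links
    return "external"


# Made-up example links from an imaginary weblog.
base = "http://blog.example.org/"
hrefs = [
    "http://www.example.com/news",
    "/category/tech",
    "/2004/05/post.html#comments",
    "mailto:me@example.org",
]
link_counts = Counter(classify_link(h, base) for h in hrefs)
```

The meta check comes before the domain check because permalinks and comment links usually live on the blog's own domain and would otherwise be counted as internal.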

This procedure takes a while. As it is quite exhausting as well, I can only analyse around 20 texts per day. So that means around 6 weeks of work until I can move on to the WordSmith analyses and the language-image interplay (I’m really dreading that…).

Corpus Update

As I pointed out in my first post, one comment about the diachronic corpus of Personal Weblogs on which my thesis is based concerned the number of texts, especially in the later periods. (An outline of the corpus structure can be found in the talks “Anything goes – everything done?” and “Stability, Diversity, and Change. The Textual Functions of Personal Weblogs”.) People argued that a low number of texts was fine for period one, as there were only a few weblogs around in those days. However, higher numbers of texts were expected for later periods, as access grew easier with more recent collection dates.

I have been thinking about these comments ever since, trying to find arguments for not extending the corpus. What I found, however, were quite weak excuses. Even more, I started wondering how I could justify a particular number of texts for a period in question at all. I came up with the following line of reasoning:

  • I work with both qualitative and quantitative methods, even though my general focus lies on the qualitative end of the continuum. Text numbers, therefore, have to be justified both from a qualitative and a quantitative point of view.
  • The qualitative framework of my thesis is heavily inspired by Grounded Theory (e.g. following Glaser & Holton 2004). In Grounded Theory, there is a process called “Theoretical Sampling” combining data collection, coding and analysis. The basic idea is that data collection is guided by the emerging theory and strives for theoretical saturation. In other words: If nothing new is found, no conflicting cases, no cases challenging the categories established so far, the analyst has reached some point close enough to theoretical saturation to stop collecting samples. (Footnote: The analyst might just as well have become blind to new phenomena through excessive preceding analysis. In that case, further collection of samples would not help the research project either.) So that’s exactly the qualitative part of my argument: collecting text samples until nothing new or challenging is discovered. This point had already almost been reached after collecting and analysing 80 to 90 texts for the periods II.A to II.C, but it was good to put my categories to the test by collecting more texts and assimilating them into my theory.
  • From a quantitative point of view, a researcher has to make some kind of informed guess about how many cases will probably be enough to make statistically sound statements. One formula suggested by Raithel (2008: 62) uses the number of variables to be joined in one analytical step (V, e.g. a correlation study of two variables) and the number of associated features (K, e.g. two features for the variable “gender”); K is raised to the power of V and the result is multiplied by 10: n ≥ 10 · K^V. As I try to trace the change within several variables which are investigated separately, my analytical steps quite often contain only one variable with a particular number of features. The variable with the highest number of features at present is the textual function, with about ten distinct features (e.g. Update, Filter, Sharing Experience, as outlined in my last post). Consequently, about 100 texts per period are roughly enough according to this formula. This is quite a tight budget: if I want to correlate the variable “textual function” with the variable “gender of author”, I have to point out that the results hint at a possible statistical connection but have to be taken with a pinch of salt.
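The rule of thumb from the last point is easy to check with a small calculation. The Python sketch below simply evaluates n ≥ 10 · K^V; the function name `min_sample_size` is my own, and the reading of K as features per variable and V as number of variables follows the description above.

```python
def min_sample_size(features_per_variable: int, variables: int = 1) -> int:
    """Rule-of-thumb minimum n = 10 * K**V (K features, V variables)."""
    if features_per_variable < 1 or variables < 1:
        raise ValueError("both arguments must be positive integers")
    return 10 * features_per_variable ** variables


# One variable (textual function) with about ten features.
n_function = min_sample_size(10)

# Two variables with two features each, as in the "gender" example.
n_two_binary = min_sample_size(2, variables=2)
```

With ten features and a single variable this gives the 100 texts per period mentioned above; two variables with two features each would need 40.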

I think that both arguments taken together form a fairly stable basis for justifying the number of cases. I guess 100 texts in the periods II.A, II.B and II.C are also a good compromise between striving for ever higher case numbers and the feasibility of qualitatively and thoroughly analysing, say, 500 texts in each period.

So, after the extension phase, which took me a bit more than one week of searching for texts, coding, basically repeating all analytical steps I had done before and updating the numbers in my thesis, the corpus looks like this now (snapshot from my screen, sorry for the quality):


Thoughts on chapter 7 again: primarily informative functional patterns

I’ve been working on chapter 7 of my thesis (textual functions of Personal Weblogs) for the last week and a half. Work is going well, even though I’m a bit worried about my time management and the amount of space this chapter will probably occupy in the end.

So far, I have finished a research review, the methodology and some functions. As I have pointed out in this post, I basically present ethnocategories such as update, filter, or sharing experience and the linguistic descriptions of postings that belong in each category. Thus, I hope to blend the benefits of ethnographic studies (such as Nardi et al. 2004a and 2004b, Reed 2005, Brake 2007, Baumer et al. 2008) and detailed linguistic analysis. The advantage of combining two methodologies (apart from gaining a clearer insight into what actually characterises the different functions, i.e., how they are actually realised in Personal Weblogs) lies, in my opinion, in the opportunity to generate functional categories on the basis of linguistic analysis which are not mentioned explicitly, or only vaguely, by bloggers or by ethnographic studies, respectively. In order to be able to present categories without an ethnographic counterpart (and for reasons of readability), I have decided to present the functions arranged in the following groups (inspired by Brinker’s works):

  1. primarily informative functions
  2. primarily appellative functions
  3. primarily contact-oriented functions
  4. functions focussed on benefits of the writing process

I am currently working on informative functions. The chapter is structured like this:

  1. filter
  2. update
  3. sharing experience
  4. further primarily informative functions

Subchapter 4 is what I am currently focussed on. It is not as easy and clear-cut as the first three subchapters. I think it should include the following functional patterns:

  • informing about external topics (cf. Puschmann 2009, 2010)
  • voicing opinions
  • review
  • giving advice

Today, I have covered the first point. I have discovered that it actually includes two patterns: first, postings that mimic a newspaper-like style and seem to belong in the category “journalistic blogging”; second, postings aimed at some kind of knowledge transfer from experts to interested laypeople. I am not sure whether I should separate these patterns, but I guess I should, as the postings of these groups look quite different and the functions “providing the latest news” vs. “transferring expert knowledge in an understandable way” are distinct enough to be treated as different patterns.

While writing this, I realised that the function “knowledge transfer” is quite close to “giving advice”, as the latter is a kind of knowledge transfer with the special twist of providing instructions. “Giving advice” also overlaps with “sharing experience”, as the advice given in Personal Weblogs is often nothing other than knowledge gained by experience. I think I should explicitly state these overlaps and use them to guide the reader smoothly through the subchapter…

Tomorrow, I will make sure to split the first category of the subchapter (informing about external topics) into the two subcategories just mentioned. From there, I will continue with “giving advice” in order to provide a smooth transition. What a plan!

Outlook: I am still thinking about the quantification part of the chapter. How does a correlation study work? My idea is to create variables for each function in SPSS to be able to state for each weblog whether the specific function could be detected. I would like to use these variables for some sort of correlation to answer the question of which functions usually co-occur in weblogs and whether functional clusters can be detected. I will give this some more thought and come back to it in the next post.
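One possible way to approach the co-occurrence question, sketched here in Python rather than SPSS, is the phi coefficient, which is the Pearson correlation for two yes/no variables. This is only one candidate measure, not a decision I have made. The coding of the six imaginary weblogs below is invented for illustration, and the function name `phi` is my own.

```python
from itertools import combinations
from math import sqrt


def phi(x, y):
    """Phi coefficient for two equally long lists of 0/1 values."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None  # one variable is constant, so phi is undefined
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical coding: one position per weblog, 1 = function detected.
weblogs = {
    "update": [1, 1, 0, 0, 1, 0],
    "filter": [0, 0, 1, 1, 0, 1],
    "sharing experience": [1, 1, 0, 0, 1, 1],
}
pairs = {
    (a, b): phi(weblogs[a], weblogs[b])
    for a, b in combinations(weblogs, 2)
}
```

Values close to 1 would indicate functions that tend to appear in the same weblogs, values close to -1 functions that tend to exclude each other. With only about 100 texts per period, such values should be read as hints rather than firm results.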