Founding member of the COW initiative. Areas of expertise:
- crawling (ClaraX random walker with texrex)
- linguistic web characterization
- document classification (COReCo and COReX frameworks)
- web page cleaning/processing (texrex software suite)
- linguistic annotation (COW toolchain)
- languages: English, German, Swedish