Last modified by Carlos Ramisch on 2014-08-22
From version 6.1, edited by Alexander Kobzar on 2014-02-10
To version 7.1, edited by Carlos Ramisch on 2014-08-22
Metadata changes
|=Property|=Previous value|=New value
|Document author|Alexander Kobzar|Carlos Ramisch
Content changes
= 1. INTRODUCTION =
= 2. SCRIPTS =
== 1.1. Prerequisites ==
== 1.2. Description ==
* [[csv2text>>attach:csv2text.zip]] extracts a plain-text corpus after it has been preprocessed by mwetoolkit.
* [[splitcorpus>>attach:splitcorpus.zip]] divides a corpus into training and test sets. The test set contains all the sentences that include an MWE.
* [[mwe2blast>>attach:mwe2blast.zip]] generates a Blast file from a Moses-generated translation and word-to-word alignment information.
* [[filterblast>>attach:filterblast.zip]] deletes sentences from a Blast file according to several criteria: word count, annotation and, optionally, patterns for wrongly identified phrasal verbs in split word order.
* [[mergeblast>>attach:mergeblast.zip]] joins two Blast files into a single one based on a user-defined dissimilarity criterion.
* [[Manual.txt>>attach:Manual.txt]] explains how to use the tools.
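
The train/test split described for splitcorpus can be illustrated with a minimal sketch. This is not the attached script: the function name `split_corpus`, the `has_mwe` predicate and the toy MWE list are all hypothetical, and the real tool presumably relies on the mwetoolkit annotation rather than on substring matching. The sketch only shows the stated rule that every sentence containing an MWE ends up in the test set.

```python
from typing import Callable, Iterable, List, Tuple


def split_corpus(
    sentences: Iterable[str],
    has_mwe: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Return (train, test); every sentence flagged by has_mwe goes to test."""
    train: List[str] = []
    test: List[str] = []
    for sentence in sentences:
        # Sentences with an MWE are held out for testing, the rest are used for training.
        (test if has_mwe(sentence) else train).append(sentence)
    return train, test


# Hypothetical toy detector: a fixed MWE list matched as lowercase substrings.
TOY_MWES = ("give up", "look after")


def toy_has_mwe(sentence: str) -> bool:
    lowered = sentence.lower()
    return any(mwe in lowered for mwe in TOY_MWES)


corpus = [
    "She decided to give up smoking.",
    "The cat sat on the mat.",
    "He will look after the children.",
]
train, test = split_corpus(corpus, toy_has_mwe)
```

With this toy input, `train` holds only the sentence about the cat and `test` holds the two sentences containing a listed MWE. Passing the detector as a parameter keeps the splitting rule separate from how MWEs are actually identified.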
