0.14 2018-11-29
[Ko van der Sloot]
* updated usage() and removed -S option (never used)
* make sure the right textclass is assigned to <w> nodes in FoLiA
* minor code fixes/refactorings
* added more tests
* updated man.1 page

[Maarten van Gompel]
* updated README.md

[Iris Hendrickx]
* Updated and extended the manual

0.13.2 2018-05-17
[Ko van der Sloot]
Bug fix release:
* uctodata is mandatory. So don't install default rules anymore

0.13.1 2018-05-17
[Ko van der Sloot]
Bug fix release:
* configure now finds out the location of the uctodata files.
  should make it work on Mac systems too

0.13 2018-05-16
[Ko van der Sloot]
* improved configure/build/test
* added a --split option
* fixed -P option
* removed -S option (never used, and only half implemented)
* added a --add-tokens option, to add special tokens for the default language
* generally use the icu:: namespace
* added more tests
* fixed uninitialized variable.
* added code to use an alternative search-path for uctodata

[Maarten van Gompel]
* added codemeta.json

0.12 2018-02-19
[Ko van der Sloot]
* now use the UniFilter Unicode Filter from ticcutils
* now use the UnicodeNormalizer from ticcutils
* improved configuration. Support for Mac OSX added

0.11 2017-12-04
[Ko van der Sloot]
Bug fix release:
* problems with text inside Cell elements

0.10 2017-11-07
[Ko van der Sloot]
New release due to outdated files in the previous release.

0.9.9 2017-11-06
[Ko van der Sloot]
Minor fix:
* bumped the .so version to 3.0.0

0.9.8 2017-10-23
[Ko van der Sloot]
Bug-fix release
* fixed utterance handling in FoLiA input. Don't try sentence detection!

0.9.7 2017-10-17
[Ko van der Sloot]
 * added textredundancy option, default is 'minimal'
 * small adaptations to work with FoLiA 1.5 specs
   - set textclass on words when outputclass != inputclass
   - DON'T filter special characters when inputclass == outputclass
 * -F (folia input) is automatically set for .xml files
 * more robust against texts with embedded tabs, etc.
 * more and better tests added
 * better logging and error messaging
 * improved language handling. TODO: Language detection in FoLiA
 * bug fixes:
   - correctly handle xml-comment inside a <t>
   - better id generation when parent has no id
   - better reaction on overly long 'words'

0.9.6 2017-01-23
[Maarten van Gompel]
* Moving data files from etc/ to share/, as they are more data files than
  configuration files that should be edited.
* Requires uctodata >= 0.4.
* Should solve debian packaging issues (#18)
* Minor updates to the manual (#2)
* Some refactoring/code cleanup, temper expectations regarding ucto's
  date-tagging abilities (#16, thanks also to @sanmai-NL)

0.9.5 2017-01-06
[Ko van der Sloot]
Bug fix release:
 * updated tokconfig-generic, which is removed from the uctodata package
 * configure no longer insists on the presence of uctodata, it merely warns
   when missing

0.9.4 2017-01-05
[Ko van der Sloot]
Major update
* Language support
 - added support for multiple languages
 - auto detection of languages using textcat
* some refactoring
  - no more call to exit()
  - Better logging and Warning messages
  - some folia output improvements
* bug fixes
  - in passthru,
  - issue #11

0.9.3 2016-09-28
[Ko van der Sloot]
Major update:
- require ICU 5.2
- implemented recursive application of rules. (which may be dangerous)
- modified tests, because not all failures were detected correctly
- check the uctodata version. version > 0.2 is preferred.

0.9.1 2016-07-12
[Ko van der Sloot]
Bug fix release:
- fixed autoconfig issue

0.9.0 2016-07-11
[Ko van der Sloot]
Major update
- now use uctodata for language specific information
  ucto itself only supports a generic tokenizer
- interactive use now uses readline library
- accept long options --help and --version
- UTF16BE now works
- better support for crooked Windows files in general
- added a --normalize option to map tokens in a certain TokenClass
  to its generic name

0.8.6 2016-04-25
[Ko van der Sloot]
* Bug fix release: fixing Sentence boundaries after abbreviations

0.8.5 2016-04-25
[Ko van der Sloot]
* Bug fix release: Better handling of regexps

0.8.4 2016-03-10
[Ko van der Sloot]
* implemented on top of libfolia 1.0

0.8.1 2016-01-14
[Ko van der Sloot]
* repository moved to GIT
* added Travis support
* more tests added
* added META-RULES code
* %include now supports full paths
* updated some languages
* fixed passthru mode
* code cleanup

0.8.0 2015-01-29
[Ko van der Sloot]
* next release
[Maarten van Gompel]
* added new tokenize(string,string) meta-function for the API
* allatonce enabled by default for tokenize() to folia doc
* fixing date rules and adding FRACNUMBER
* added Russian
* added rules for Portuguese tokenization
[Antal vd Bosch]
* added RK to Dutch abbrev.

0.7.0 2014-11-26
[Ko van der Sloot]
* unofficial release
* experimental PUNCTUATION filter
* bug fixes
[Maarten van Gompel]
* reduced memory usage

0.6.0 2014-09-23
[Ko van der Sloot]
* release

0.5.5 2014-06-xx
* made getSentence() public
* adapted to most recent libfolia (0.11 or above)
* needs libticcutils 0.6 or above
* uses TiCC::CommandLine
* detect EMOTICON's
* generally switched to UChar32 and Unicode codepoints. (avoid length() problems)
* handle FoLiA Note like Caption
* a lot of bug fixes concerning FoLiA output (<t> nodes, textclass values etc.)
* again some changes around quotes
* improved tokenisation in different languages
* added Swedish

0.5.3 2013-04-04
[Folgert Karsdorp]
* Fixed quote detection, added tests. Still shaky and disabled by default
[Ko van der Sloot]
* changed verbose output slightly
* fixed id's in folia output
* various folia fixes
* honour BOM markers in input file
* lots of configuration updates
* some fixes in handling of RULES

0.5.2 2012-03-29
[Ko vd Sloot]
* some small changes. Made it work with libfolia 0.9

0.5.1 2012-02-27
[Ko vd Sloot]
* added 'escape' possibility for regexps that start with a [
* better debugging output
* removed all (?i) stuff from regexps. This attempts to avoid an ICU bug
* added -X and --id= options
* adapted to libfolia 0.8 (/tests too!)
* some cleanup and refactoring

[Maarten van Gompel]
* added better rules for apostrophes in ATTACHEDSUFFIX and TOKENS

0.5.0 2012-01-09
[Ko vd Sloot]
* added a different and more powerful SMILEY rule, which also happens to work
  on older ICU versions

0.4.9 2011-12-21
[Ko vd Sloot]
 * reworked and more folia integration

0.4.8 2011-11-02
[Ko vd Sloot]
 * use libfolia to generate folia XML

0.4.7 - (not released yet, feel free to add more stuff)
[ Maarten van Gompel ]
* Fix: proper XML entities in FoLiA output
* fixed bug77 (the NOSPACE bug)
* Fix: Nested quote problem (2011-08-18)
* Improved protection against unbalanced quotes/sentences (2011-08-18)

[Ko van der Sloot]
* fixed passthru encoding problem
* fixed problem with CRLF separated lines (bug 78)
* configdir vs. config file hassle moved more inside. simpler API now.
* -Q option works reversed now. -Q Enables Quote detection.
  Quote detection appears to be very hard and fragile.

0.4.6 - 2011-05-17
[Ko van der Sloot]
* changed the regexp for KNOWN-ABBREVIATIONS to case sensitive
* fixed include file handling for non-standard locations
* fixed a problem with NON-Unix files. ucto would crash on a line with just '\r'

0.4.5 - 2011-04-27
[ Maarten van Gompel]
* Added sentenceperline support for PassThru mode, improved sentenceperline support for normal mode

[ Ko vd Sloot ]
* on failure, ucto didn't use the right exit code. 0 == SUCCESS (on most systems)
* added functions to display version info.

0.4.4 - 2011-03-31
[ Maarten van Gompel]
* fixed "fatal error: ucto: out of range :No sentence exists with the specified index" problem. (Bug 65)

[ Ko van der Sloot ]
* Fixed terrible bug. Unicode strings were output in the current locale.
  But we advertise UTF8

0.4.3 - 2011-03-19

[ Ko van der Sloot ]
* src/ucto.cxx: fixed --passthru problem
* tests/testpassthru.ok: test now works

[ Joost van Baal ]
* NEWS: record changes and releases


0.4.2 - 2011-03-17

[ Ko van der Sloot ]
* include/ucto/tokenize.h, src/tokenize.cxx,
  src/unicode.cxx: passes -pedantic
* configure.ac: some cleanup, bumped version
* include/ucto/tokenize.h, src/tokenize.cxx, src/ucto.cxx:
  added (hidden) --passthru option
* [r8842] tests/passthru.txt, tests/testall, tests/testpassthru,
  added a passthru test.
  has to be tested :)
* include/ucto/tokenize.h, src/tokenize.cxx: make compiler
  more happy
* docs/ucto.1: added description, small update


0.4.1 - 2011-03-11

[ Ko van der Sloot ]
* src/tokenize.cxx: fixed regexp and error message
* config/tokconfig-nl, src/tokenize.cxx: added the
  possibility to set the order of RULES in the config file
* tests/bug0063.nl.tok.V, tests/bug0063.nl.txt: added a
  test for bug63
  Not sure about the 'correct' solution
* docs/ucto.1: updated man page

[ Maarten van Gompel ]
* src/tokenize.cxx: fixed passthruline (skip=t) bug, FoLiA XSL has to be
  local unfortunately
* tests/bug0063.nl.tok.V: override
* config/tokconfig-nl, src/tokenize.cxx,
  tests/bug0052.nl.tok.V, tests/normalisation.nl.tok.V,
  tests/test.nl.tok.V: fix bug0063

0.4.0 - 2011-03-04

[ Maarten van Gompel ]
* logo.svg: added logo


0.3.7 - 2011-03-01

[ Ko van der Sloot ]
* [r8636] tests/testoption1.ok, tests/testusage.ok: these tests
  give a different outcome now.
* [r8318] src/tokenize.cxx: added experimental code to use the -n
  option ( output one sentence per line) also to process the input
  one sentence per line
* [r8317] tests/bug0054.nl.tok.V, tests/bug0054.nl.txt: testcase
  for bug0054

[ Maarten van Gompel ]
* [r8618] include/ucto/tokenize.h, src/tokenize.cxx, src/ucto.cxx:
  sentence per line input and output: two modes
* [r8617] src/tokenize.cxx, tests/bug0048.nl.tok.V,
  tests/bug0054.nl.tok.V: Fixed bug 54
* [r8615] src/tokenize.cxx, tests/abbreviations.nl.tok.V,
  tests/nu.nl.tok.V, tests/test.nl.tok.V: fixes
* [r8614] src/tokenize.cxx: FoLiA improvement


0.3.6 - 2011-02-12

[ Ko van der Sloot ]
* tests/: more tests added
* configure.ac, include/ucto/tokenize.h, src/tokenize.cxx,
  src/ucto.cxx, tests/testnormalisation: added possibility to set
  the inputEncoding. Breaks the ucto user interface!


0.3.5 - 2011-02-10

[ Ko van der Sloot ]
* src/ucto.cxx: fix memory leak
* include/ucto/tokenize.h, include/ucto/unicode.h,
  src/Makefile.am, src/tokenize.cxx, src/ucto.cxx, src/unicode.cxx,
  include/ucto/tokenize.h, src/unicode.cxx: added copyright notice
* include/ucto/tokenize.h, src/tokenize.cxx, src/ucto.cxx: -f option
  now works
* config/tokconfig-nl, include/ucto/tokenize.h, src/tokenize.cxx,
  src/ucto.cxx: added support for ligature filtering and Unicode
  normalizing.  a bit rough still
* tests/: more tests added
* ucto.pc.in: now uses ucto-icu.pc

[ Maarten van Gompel ]
* configure.ac: version bump


0.3.4 - 2011-01-27

[ Joost van Baal ]
* Makefile.am, configure.ac, icu.pc.in, ucto-icu.pc.in: rename icu.pc to
  ucto-icu.pc: be sure we won't suffer from filename clashes in the future.
  Once Debian and other distros ship icu 4.6's usr/lib/pkgconfig/icu-io.pc
  (released 2010-12-02) we can get rid of our local copy.

[ Ko van der Sloot ]
* tests/: more tests added

[ Maarten van Gompel ]
* include/ucto/tokenize.h, src/tokenize.cxx: Updates in FoLiA support

0.3.3 - 2011-01-27

[ Joost van Baal ]
* Various bugfixes

0.3.2 - 2011-01-27

[ Ko van der Sloot ]
* Various bugfixes

0.3.1 - 2011-01-26

[ Ko van der Sloot ]
* Various bugfixes

0.3.0 - 2011-01-26

[ Maarten van Gompel ]
* tests/: Added lots of tests
* configure.ac, include/ucto/tokenize.h, src/tokenize.cxx, src/ucto.cxx:
  major refactoring. Improved buffering, less unnecessary storing of
  token/sentence vectors in memory. Improved quote support.
* include/ucto/tokenize.h, src/tokenize.cxx: Ucto now remembers if a token
  was spaced or not in the original, enabling ucto to reconstruct the original
  text exactly.
* include/ucto/tokenize.h, src/tokenize.cxx: Added quote detection support
* include/ucto/tokenize.h, src/tokenize.cxx, src/ucto.cxx: Added preliminary
  FoLiA XML output support in ucto
* include/ucto/tokenize.h, src/tokenize.cxx, src/ucto.cxx: Big API overhaul

[ Peter Berck ]
* config/Makefile.am, config/tokconfig-sv: Added Swedish tokconfig

[ Ko van der Sloot ]
* config/tokconfig-nl, src/tokenize.cxx: read QUOTES from config file
* src/ucto.cxx: refuse to run when inputfile is bad
* docs/ucto.1: added a simple 'man' page
* src/ucto.cxx: added a -p switch to disable paragraph
  detection

0.0.1 - 2010-12-25

- First snapshot release

unreleased - 2010-12-09

- started to create a separate package
