Sentence Boundary Detection (SBD).

Split text into sentences with a `vanilla` rule-based approach (i.e., one that works ~95% of the time).

  • Splits text at periods, question marks, and exclamation marks
  • Skips (most) abbreviations (Mr., Mrs., PhD.)
  • Skips numbers and currency amounts
  • Skips URLs, websites, email addresses, and phone numbers
  • Treats an ellipsis and ?! as a single punctuation mark
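
To make the rules above concrete, here is a minimal sketch of a rule-based splitter in Python. It is not this library's implementation: the function name `split_sentences`, the tiny `ABBREVIATIONS` set, and the regular expression are all assumptions chosen for illustration. It treats a run of terminal punctuation followed by whitespace as a candidate boundary, which keeps numbers, URLs, and email addresses intact because their periods are not followed by whitespace.

```python
import re

# Hypothetical abbreviation list for this sketch; a real splitter needs a far larger one.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "phd", "etc"}

# Candidate boundary: a run of terminal punctuation ("...", "?!", ".") followed by whitespace.
# Periods inside "3.50" or "www.example.com" are not followed by whitespace, so they never match.
BOUNDARY = re.compile(r"[.?!]+\s+")


def split_sentences(text):
    """Split text into sentences using a few simple rules (illustrative only)."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        chunk = text[start:match.end()].strip()
        last_token = chunk.split()[-1]
        word = last_token.rstrip(".?!").lower()
        # A known abbreviation followed by a period is not a sentence end.
        if last_token.endswith(".") and word in ABBREVIATIONS:
            continue
        sentences.append(chunk)
        start = match.end()
    # Whatever remains after the last boundary is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


example = "Mr. Smith paid $3.50 at www.example.com. Really?! Yes... He did."
for sentence in split_sentences(example):
    print(sentence)
```

The sketch also shows why such an approach only works most of the time: a sentence that genuinely ends in an abbreviation (for example "She has a PhD.") is merged with the one that follows.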

Toggle options:

  • newline_boundaries: Split sentences at newlines
  • html_boundaries: Split sentences at specific tags (br, and closing p, div, ul, ol)
  • sanitize: Enable this if you do not expect, or do not want, HTML in your output
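
As a rough illustration of what these options could mean, the following Python sketch applies them as a pre-splitting step. The option names come from the list above; the function `split_on_boundaries`, its regular expressions, and the crude tag stripping used for `sanitize` are assumptions made for this example and do not describe how the library itself handles these options.

```python
import re

# Tags named in the option description: <br> plus closing p, div, ul, ol.
HTML_BOUNDARY = re.compile(r"<br\s*/?>|</(?:p|div|ul|ol)>", re.IGNORECASE)

# Crude tag matcher used only for this sketch; it is not a real HTML sanitizer.
ANY_TAG = re.compile(r"<[^>]+>")


def split_on_boundaries(text, newline_boundaries=False, html_boundaries=False, sanitize=False):
    """Cut text into segments at the boundaries enabled by the options (illustrative only)."""
    segments = [text]
    if html_boundaries:
        # Every boundary tag ends the current segment.
        segments = [part for seg in segments for part in HTML_BOUNDARY.split(seg)]
    if newline_boundaries:
        # Every newline ends the current segment.
        segments = [part for seg in segments for part in seg.splitlines()]
    if sanitize:
        # Drop any remaining tags from the output.
        segments = [ANY_TAG.sub("", seg) for seg in segments]
    return [seg.strip() for seg in segments if seg.strip()]


html = "<p>First line</p><p>Second<br>Third</p>"
print(split_on_boundaries(html, html_boundaries=True, sanitize=True))
print(split_on_boundaries("One\nTwo", newline_boundaries=True))
```

Each resulting segment would then be passed to the ordinary punctuation-based splitter, so a boundary tag or newline ends a sentence even when no period precedes it.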