KenLM for Developers
Download the source.
Integrating
- Copy kenlm into your source tree. Distributing with your decoder is encouraged; see LICENSE.
- Omit the lm/filter, lm/builder, and util/stream directories if you only want query support. Omit python if you don't use Python.
- If using your own build system (recommended), delete the windows directory and reimplement compile_query_only.sh (for queries) or the CMakeLists.txt files (for everything).
- Choose Boost, ICU, zlib, bzip2, and lzma support. See README.md in the source.
- Code against the interface in the next section.
- If your system does not generate hypotheses left-to-right, see lm/left.hh for a higher-level interface with left state minimization.
Interface Example
The interface is designed for efficient use inside a decoder:
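#include "lm/model.hh"

#include <iostream>
#include <string>

int main() {
  using namespace lm::ngram;
  Model model("file.arpa");
  // Start from the beginning-of-sentence context; out_state receives the updated context.
  State state(model.BeginSentenceState()), out_state;
  const Vocabulary &vocab = model.GetVocabulary();
  std::string word;
  while (std::cin >> word) {
    // Print the score of this word given the context held in state.
    std::cout << model.Score(state, vocab.Index(word), out_state) << '\n';
    state = out_state;  // carry the context forward to the next word
  }
}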
Keeping state is recommended for speed, but not required.
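As a rough illustration of the state-passing pattern, the sketch below sums per-word scores over a sentence while threading state from one call to the next. The helper name SentenceLogProb is hypothetical. It relies only on the calls shown in the example above (BeginSentenceState, GetVocabulary, Index, Score). ToyModel, ToyVocabulary, and ToyState are made-up stand-ins with invented scores so the sketch compiles without KenLM; they are not part of KenLM's API.

```cpp
#include <map>
#include <string>
#include <vector>

// Sum the scores of a word sequence, threading state from one word to the next.
// Works with any model type that offers the calls used in the example above:
// BeginSentenceState(), GetVocabulary().Index(word), and Score(in, word, out).
template <class Model>
float SentenceLogProb(const Model &model, const std::vector<std::string> &words) {
  auto state = model.BeginSentenceState();  // copy of the beginning-of-sentence context
  decltype(state) out_state;                // written by Score before it is read
  float total = 0.0f;
  for (const std::string &word : words) {
    total += model.Score(state, model.GetVocabulary().Index(word), out_state);
    state = out_state;  // reuse the updated context instead of rebuilding it
  }
  return total;
}

// Everything below is a hypothetical stand-in, not KenLM code.
struct ToyState {
  unsigned last_word;
};

class ToyVocabulary {
 public:
  ToyVocabulary() {
    ids_["<s>"] = 1;
    ids_["</s>"] = 2;
    ids_["the"] = 3;
    ids_["cat"] = 4;
  }
  // Returns 0 for words outside the toy vocabulary.
  unsigned Index(const std::string &word) const {
    std::map<std::string, unsigned>::const_iterator it = ids_.find(word);
    return it == ids_.end() ? 0 : it->second;
  }

 private:
  std::map<std::string, unsigned> ids_;
};

class ToyModel {
 public:
  ToyModel() { begin_.last_word = 1; }  // context holds only <s>
  const ToyState &BeginSentenceState() const { return begin_; }
  const ToyVocabulary &GetVocabulary() const { return vocab_; }
  // Invented scores: the result depends on the previous word held in the state.
  float Score(const ToyState &in, unsigned word, ToyState &out) const {
    out.last_word = word;
    if (word == 0) return -2.0f;             // unknown word
    if (word == in.last_word) return -3.0f;  // immediate repeat
    return -1.0f;
  }

 private:
  ToyState begin_;
  ToyVocabulary vocab_;
};
```

With KenLM linked in, the same template could presumably be called with the Model from the example above in place of ToyModel. KenLM scores are log10 probabilities, so the sum is the log10 probability of the word sequence; whether to append </s> to the words is left to the caller.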
More Documentation
Public APIs appear in lm/virtual_interface.hh and lm/model.hh. A paragraph documents each call.