Sunday, September 29, 2024

How to build and run the Rust implementation of tree-sitter

 How to build the Rust implementation of tree-sitter


$ sudo ./script/build-wasm


  (Note: running the script without sudo may fail with a permission error when it tries to access Docker.)


$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh


  (This installs the latest stable Rust toolchain, including rustc and cargo, under ~/.cargo, as described at https://rustup.rs/.)


$ source ~/.cargo/env


 (This adds the installed Rust toolchain to the PATH of the current shell.)


$ cargo build


 (This builds tree-sitter with the installed Rust toolchain. With no build-mode option, tree-sitter is built in debug mode, which provides various debug options, and the binary is written to target/debug/tree-sitter. Alternatively, pass --release to get a release build in target/release/tree-sitter.)


How to run tree-sitter, say, over tree-sitter-python?


$ curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash


  (Note: nvm needs to be set up first in order to install npm and other JavaScript tools.)

$ nvm install --lts

  (This installs the latest LTS release of Node.js, which comes with npm.)

$ git clone https://github.com/tree-sitter/tree-sitter-python

  (As an example, consider tree-sitter-python, a Python parser built with tree-sitter, which uses its own external lexer implementation that is not generated by tree-sitter.)

$ cd tree-sitter-python

Set TREE_SITTER to the path of either the debug binary or the release binary:
  • export TREE_SITTER=/home/khchoi/work/lang/tree-sitter/tree-sitter/target/debug/tree-sitter
  • export TREE_SITTER=/home/khchoi/work/lang/tree-sitter/tree-sitter/target/release/tree-sitter
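The two paths differ only in the build-mode directory, so the choice can be sketched as a small shell function. The function name tree_sitter_bin and the path /path/to/tree-sitter are hypothetical placeholders, not part of tree-sitter; substitute the root of your own checkout.

```shell
# Hypothetical helper: print the path of the tree-sitter binary for a build mode.
# $1 = root of the tree-sitter checkout, $2 = "debug" (default) or "release".
tree_sitter_bin() {
    case "${2:-debug}" in
        debug|release) printf '%s\n' "$1/target/${2:-debug}/tree-sitter" ;;
        *) echo "unknown build mode: $2" >&2; return 1 ;;
    esac
}

# Example: point TREE_SITTER at the release binary (placeholder checkout path).
TREE_SITTER=$(tree_sitter_bin /path/to/tree-sitter release)
export TREE_SITTER
echo "$TREE_SITTER"
```

Switching between the debug and release binaries then only requires changing the second argument.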

$ $TREE_SITTER generate --debug-build

 (Assuming $TREE_SITTER is the path to the tree-sitter binary, this generates src/parser.c and other files. At the moment, it is unclear to me what effect --debug-build has on generate.)
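The variable exported earlier is TREE_SITTER, with an underscore. Shell variable names may contain only letters, digits, and underscores, so a hyphenated spelling does not refer to it. A minimal sketch of the difference, where both values are placeholders chosen only to make the effect visible:

```shell
# Placeholder values for illustration only.
TREE_SITTER=/path/to/target/debug/tree-sitter
TREE=oops   # hypothetical variable, set only to show what a hyphenated name expands to

echo "$TREE_SITTER"     # the intended variable
echo "$TREE-SITTER"     # expands $TREE, then appends the literal text "-SITTER"
echo "${TREE_SITTER}"   # braces make the variable name explicit
```

If TREE is unset, the hyphenated form expands to just "-SITTER", which the shell then tries to run as a command.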

$ $TREE_SITTER build --debug

 (It changes ./src/node-types.json.)

$ $TREE_SITTER parse --debug YOUR-PYTHON-PROGRAM.py

 (It prints various logs, including parsing actions and states, and then prints the syntax tree for the Python program.)
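The generate, build, and parse steps can be chained so that a failure in one step stops the rest. The following is only a sketch: the function names run and parse_with_debug and the DRY_RUN variable are hypothetical, and the flags are the ones used above. With DRY_RUN=1 the commands are printed rather than executed, which is a convenient way to check the sequence first.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run the three steps in order; stop at the first one that fails.
# $1 = the Python program to parse.
parse_with_debug() {
    run "$TREE_SITTER" generate --debug-build &&
    run "$TREE_SITTER" build --debug &&
    run "$TREE_SITTER" parse --debug "$1"
}

# Example (dry run; nothing is executed, the three commands are only printed):
DRY_RUN=1
TREE_SITTER=tree-sitter
parse_with_debug example.py
```

For a real run, unset DRY_RUN and set TREE_SITTER to the binary path, then call parse_with_debug from inside the tree-sitter-python directory.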



Wednesday, February 14, 2024

Fixing the error "commitAndReleaseBuffer: invalid argument (invalid character)" when building Haskell with stack

 

On Windows, I often run into this annoying error:

 - stack build

   ...

   commitAndReleaseBuffer: invalid argument (invalid character)

   ...


This error seems to be caused by the locale setting on Windows. To see the current locale, and then to change it, you can use the following PowerShell commands:

 - Get-WinSystemLocale

 - Set-WinSystemLocale en-US

My locale was set to Korean. I changed it to en-US, and the error magically went away!


The command that changes the locale must be run in PowerShell as administrator.


On Ubuntu, I have rarely seen such an error, "commitAndReleaseBuffer: invalid argument (invalid character)".
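For comparison, the locale can be inspected on Ubuntu as well. The sketch below assumes the error is tied to the character encoding of the locale, which is a guess rather than something confirmed here. The helper name is_utf8 is hypothetical; locale charmap is the standard command that prints the character map of the current locale.

```shell
# Hypothetical helper: succeed if the given character map name is UTF-8.
is_utf8() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the system for the current character map; fall back to "unknown".
charmap=$(locale charmap 2>/dev/null || echo unknown)
echo "LANG=${LANG:-unset}, charmap=$charmap"

if is_utf8 "$charmap"; then
    echo "the locale uses UTF-8"
else
    echo "the locale does not use UTF-8; non-ASCII output may fail to encode"
fi
```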


More importantly, this error does not come solely from the locale setting; it is triggered by a real error in the Haskell program being built.


My guess is that something is wrong in the Haskell program, and stack or GHC detects it. It then tries to write something about the detected error somewhere (perhaps using SQL?), and the relevant Haskell library hits the locale problem, producing this infamous error message:

  - commitAndReleaseBuffer: invalid argument (invalid character)