Rust Cookbook

Splitting a string

Last updated 5 years ago

How to split a string based on specific separators?

  • By separator - s.split("separator")

  • By whitespace - s.split_whitespace()

  • By newlines - s.lines()
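As a minimal sketch of the three methods listed above, the following collects each result into a Vec so it can be printed. The helper names (by_separator, by_whitespace, by_lines) and the sample inputs are made up for illustration and are not part of the standard library.

```rust
// Hypothetical helper: split on a fixed separator string.
fn by_separator(s: &str) -> Vec<&str> {
    s.split(", ").collect()
}

// Hypothetical helper: split on runs of whitespace; leading and
// trailing whitespace produce no empty pieces.
fn by_whitespace(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

// Hypothetical helper: split on line endings ("\n" or "\r\n").
fn by_lines(s: &str) -> Vec<&str> {
    s.lines().collect()
}

fn main() {
    println!("{:?}", by_separator("a, b, c"));
    println!("{:?}", by_whitespace("  a \t b\n c  "));
    println!("{:?}", by_lines("first\nsecond\r\nthird"));
}
```

Note that split("separator") keeps empty pieces when separators are adjacent, while split_whitespace never yields empty pieces.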

How to split a string based on multiple characters?

The standard library's split method also accepts a closure as the pattern, which covers this use case.

fn split_stuff() {
    let message = "Hello there | how are you ; doing?";
    for s in message.split(|c| (c == '|') || (c == ';')) {
        dbg!(s);
    }
}

Output would be as follows,

s = "Hello there "
s = " how are you "
s = " doing?"

To get rid of the surrounding spaces, the trim() method can be used on each piece.
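One way to remove the surrounding spaces, sketched here under the assumption that every piece should be trimmed, is to map str::trim over the split iterator. The function name split_and_trim is hypothetical and used only for this example.

```rust
// Hypothetical helper: split on '|' or ';' and trim each piece.
fn split_and_trim(message: &str) -> Vec<&str> {
    message
        .split(|c| c == '|' || c == ';')
        .map(str::trim) // strips leading and trailing whitespace
        .collect()
}

fn main() {
    for s in split_and_trim("Hello there | how are you ; doing?") {
        println!("{:?}", s);
    }
}
```

Because trim returns a subslice of the original string, no new strings are allocated here.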

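Another approach, without using a closure, is to pass a slice of characters as the pattern. As an example, the above split can be written as,

for s in message.split(&['|', ';'][..]) {
    // ...
}

Although it is possible to write a closure to split on arbitrarily complex conditions, it is easier to use the regex crate for splitting based on regular expressions.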

The pieces produced by a split can be collected into a vector or other Rust collection types with the collect method.
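A short sketch of collecting, assuming the same '|' and ';' separators as above. The target type is chosen through the return type annotation; Vec and BTreeSet are used here only as examples, and the function names to_vec, to_set and to_owned_vec are hypothetical.

```rust
use std::collections::BTreeSet;

// Hypothetical helper: collect the pieces in order, duplicates kept.
fn to_vec(s: &str) -> Vec<&str> {
    s.split(&['|', ';'][..]).collect()
}

// Hypothetical helper: collect into a sorted set, duplicates dropped.
fn to_set(s: &str) -> BTreeSet<&str> {
    s.split(&['|', ';'][..]).collect()
}

// Hypothetical helper: collect owned Strings that outlive the input.
fn to_owned_vec(s: &str) -> Vec<String> {
    s.split(&['|', ';'][..]).map(String::from).collect()
}

fn main() {
    let message = "b|a;b|c";
    println!("{:?}", to_vec(message));
    println!("{:?}", to_set(message));
    println!("{:?}", to_owned_vec(message));
}
```

Collecting into Vec<&str> borrows from the original string; mapping to String first is one option when the pieces need to be stored independently of it.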

How to split a string based on regular expressions?

Splitting a string such as "hello there; how|    are|you, Chathura" on multiple characters while also handling runs of spaces is not simple without regular expressions.

The following solution demonstrates the use of the regex crate to accomplish the task.

fn split_by_regex() {
    let message = "hello there; how|    are|you, Chathura";

    let re = regex::Regex::new(r"[;,|\s]\s*").unwrap();

    for s in re.split(message) {
        dbg!(s);
    }
}

For a fixed set of single-character separators, a regular expression is not required: a slice of characters can be passed to split as the pattern, for example message.split(&['|', ';'][..]).