TL;DR

In some cases, std::string_view, introduced in C++17, is faster than std::string. A string_view stores only a pointer to the first element and the length of the character sequence, so it avoids copying the string and provides an immutable view of the original data.

What is std::string_view?

The class template basic_string_view describes an object that can refer to a constant contiguous sequence of char-like objects with the first element of the sequence at position zero.

from cppreference.com

The purpose of string_view is to avoid copying data that is already owned somewhere else and of which only a non-mutating view is required, e.g. a string that a function would otherwise take by const reference. The idea is to store a pair: a pointer to the first element and the size of the string.
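To make the "pointer plus size" idea concrete, here is a minimal sketch using std::string_view. The helper SharesBuffer and the function DemoViewVersusCopy are hypothetical names invented for this illustration; they only check that a view refers to the original buffer while a std::string copy does not.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: true if `view` refers to exactly the characters owned
// by `owner` (same start address, same length), i.e. no copy was made.
bool SharesBuffer(std::string_view view, const std::string& owner) {
  return view.data() == owner.data() && view.size() == owner.size();
}

// Usage sketch: a view aliases the original buffer, a std::string copy does not.
void DemoViewVersusCopy() {
  const std::string owner = "The string we have";
  const std::string_view view = owner;  // stores pointer + size only
  const std::string copy = owner;       // copies the characters into its own buffer
  assert(SharesBuffer(view, owner));
  assert(copy.data() != owner.data());
}
```

Calling DemoViewVersusCopy() from main should pass both assertions when compiled with -std=c++17.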

For example, there are multiple ways to declare a function that needs a (const) string reference. You might be familiar with the first two conventions, but each has problems.

When to use std::string_view?

void TakesCharStar(const char* s);             // C convention
void TakesString(const string& s); // Old Standard C++ convention

void TakesStringView(absl::string_view s); // Abseil C++ convention
void TakesStringView(std::string_view s); // C++17 C++ convention

If you already have a string but the interface needs a const char*, calling c_str() to convert it is acceptable but inconvenient.

void SomeFunction() {
  string s = "The string we have";
  TakesCharStar(s.c_str()); // explicit conversion
}

If you already have a char pointer but the interface needs a const string&, you can pass the pointer directly without calling any conversion method. However, this creates a temporary string and copies the characters into it, which takes O(n) time.

void SomeFunction() {
  // A string literal only converts to const char* in C++11 and later.
  const char* s = "The char pointer we have";
  TakesString(s); // requires string copy
}

Can we do better? string_view can handle both cases gracefully and efficiently.

void SomeFunction() {
  string s1 = "The string we have";
  const char* s2 = "The char pointer we have";
  TakesStringView(s1);
  TakesStringView(s2);
}

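string_view provides implicit conversion constructors that accept both a char pointer and a const string reference. Since they only need the pointer to the first char and the length of the string, neither constructor allocates memory or copies characters. Constructing from a string runs in O(1) time. Constructing from a char pointer still has to scan for the terminating NUL to compute the length, which is O(n), but it is typically much cheaper than building a temporary string.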

template <typename Allocator>
string_view(
    const std::basic_string<char, std::char_traits<char>, Allocator>&
        str) noexcept
    : ptr_(str.data()), length_(CheckLengthInternal(str.size())) {}

constexpr string_view(const char* str)
    : ptr_(str), length_(CheckLengthInternal(StrLenInternal(str))) {}
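As a rough illustration of how the two conversions behave, the sketch below uses std::string_view (not the Abseil class shown above). CountSpaces and DemoConstruction are hypothetical names made up for this example. It also shows one consequence of computing the length from a char pointer: the scan stops at the first NUL, so data with embedded NULs needs the pointer-plus-length constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical function that takes its argument by string_view.
std::size_t CountSpaces(std::string_view s) {
  std::size_t count = 0;
  for (char c : s) {
    if (c == ' ') ++count;
  }
  return count;
}

void DemoConstruction() {
  const std::string owned = "a b c";
  const char* raw = "a b c";
  assert(CountSpaces(owned) == 2);  // from std::string: size() is already known
  assert(CountSpaces(raw) == 2);    // from const char*: length found by scanning for NUL

  // The scan stops at the first NUL, so an explicit length is needed to
  // view data that contains embedded NULs.
  assert(std::string_view("ab\0cd").size() == 2);
  assert(std::string_view("ab\0cd", 5).size() == 5);
}
```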

What else

string_view can bring a huge performance gain. Let’s borrow two examples from Performance of std::string_view vs std::string from C++17.

#include <benchmark/benchmark.h>

#include <string>
#include <string_view>

static void StringSubStr(benchmark::State& state) {
  std::string s = "Hello Super Extra Programming World";
  // Code before the loop is not measured
  for (auto _ : state) {
    auto str = s.substr(1, 30);
    // Make sure the variable is not optimized away by compiler
    benchmark::DoNotOptimize(str);
  }
}
// Register the function as a benchmark
BENCHMARK(StringSubStr);

static void StringViewSubStr(benchmark::State& state) {
  std::string s = "Hello Super Extra Programming World";
  for (auto _ : state) {
    std::string_view sv = s;
    auto str = sv.substr(1, 30);
    benchmark::DoNotOptimize(str);
  }
}
BENCHMARK(StringViewSubStr);

Online benchmark playground: Quick C++ Benchmark

Here is the benchmark result with gcc-8.2, -std=c++17, and -O3 as the optimization level. It’s a huge performance improvement.

[Image: benchmark result chart]

Because it’s O(n) vs O(1).

The key difference is std::string::substr vs std::string_view::substr. The former returns a new string holding a copy of the substring, while the latter returns a view of the substring; the former runs in linear complexity, while the latter has constant complexity. Let’s take a look at its implementation in Abseil. Apart from the input check, all it does to get the substring is move the pointer and adjust the length. There is no copy, so it has O(1) complexity and its performance is independent of the size of the substring.

// string_view::substr()
//
// Returns a "substring" of the `string_view` (at offset `pos` and length
// `n`) as another string_view. This function throws `std::out_of_range` if
// `pos > size`.
string_view substr(size_type pos, size_type n = npos) const {
  if (ABSL_PREDICT_FALSE(pos > length_))
    base_internal::ThrowStdOutOfRange("absl::string_view::substr");
  n = std::min(n, length_ - pos);
  return string_view(ptr_ + pos, n);
}
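The same behaviour can be observed with std::string_view, which follows the same contract. The sketch below is only an illustration: SubstrThrows and DemoSubstr are hypothetical names, and the string is the one used in the benchmark above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: true if sv.substr(pos) throws std::out_of_range.
bool SubstrThrows(std::string_view sv, std::size_t pos) {
  try {
    (void)sv.substr(pos);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}

void DemoSubstr() {
  const std::string s = "Hello Super Extra Programming World";  // 35 chars
  const std::string_view sv = s;

  const std::string_view sub = sv.substr(1, 30);
  assert(sub.data() == s.data() + 1);  // same buffer, pointer moved by 1
  assert(sub.size() == 30);

  assert(sv.substr(30, 100).size() == 5);   // n is clamped to size - pos
  assert(!SubstrThrows(sv, sv.size()));     // pos == size yields an empty view
  assert(SubstrThrows(sv, sv.size() + 1));  // pos > size throws
}
```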

Let’s switch to string_view

Emm, wait.

Adding string_view to an existing code base is not always the best practice. Changing an interface to pass by string_view can be inefficient and sometimes dangerous if the implementation behind it requires a string or a NUL-terminated const char pointer. For example, suppose function A(const string& s) calls B(const char* s). If you change A’s interface to A(std::string_view sv), you need to convert the string_view to a const char*. Calling std::string_view::data() is dangerous because sv is not guaranteed to be NUL-terminated, and copying sv into a temporary std::string brings back the copy you were trying to avoid. Therefore, it is encouraged to adopt string_view starting at the utility code and working upward, or to use it consistently from the start of a new project.
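A minimal sketch of the problem and two common workarounds follows. LegacyLength stands in for a hypothetical legacy API like B above; SafeLegacyLength, FormatView, and DemoLegacyCall are likewise invented for illustration. The first workaround copies the view into a std::string to get a NUL terminator; the second passes an explicit length to a printf-style function with %.*s.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical legacy API that requires a NUL-terminated string.
std::size_t LegacyLength(const char* s) { return std::strlen(s); }

// Workaround 1: copy into a std::string so that c_str() is NUL-terminated.
std::size_t SafeLegacyLength(std::string_view sv) {
  return LegacyLength(std::string(sv).c_str());
}

// Workaround 2: give printf-style functions the length explicitly.
// The 64-byte buffer is an arbitrary choice for this sketch.
std::string FormatView(std::string_view sv) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "[%.*s]", static_cast<int>(sv.size()),
                sv.data());
  return buf;
}

void DemoLegacyCall() {
  const std::string s = "Hello World";
  const std::string_view sv = std::string_view(s).substr(0, 5);  // "Hello"

  // data() points into s, so the legacy call reads past the end of the view.
  assert(LegacyLength(sv.data()) == 11);
  assert(SafeLegacyLength(sv) == 5);
  assert(FormatView(sv) == "[Hello]");
}
```

Here the overrun is harmless only because the view happens to sit inside a NUL-terminated std::string; a view over a raw buffer offers no such guarantee.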

Secondly, string_view does not own the data. Always keep in mind that string_view only provides an immutable view of the original data; in other words, the source string must outlive the string_view. For example, avoid declaring a class member of type string_view, because it is hard to guarantee the lifetime of the data it refers to.
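One way to follow this advice, sketched below under the assumption that the class needs the text after the caller's buffer is gone, is to accept a string_view at the interface but store an owning std::string. Greeter and MakeGreeter are hypothetical names for this example.

```cpp
#include <string>
#include <string_view>

// Hypothetical class: takes a view at the interface, but owns a copy.
class Greeter {
 public:
  explicit Greeter(std::string_view name) : name_(name) {}
  std::string Greet() const { return "Hello, " + name_; }

 private:
  std::string name_;  // owning copy; a string_view member here could dangle
};

Greeter MakeGreeter() {
  std::string temp = "World";
  // temp is destroyed on return, but Greeter holds its own copy of the text.
  return Greeter(temp);
}
```

Had name_ been a std::string_view, the object returned by MakeGreeter would refer to the destroyed temp, and calling Greet() would be undefined behaviour.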

Summary

By introducing string_view, you can achieve a significant performance boost in string-heavy operations. However, always keep the caveats above in mind and use it in the right way.

Pros

  • It provides a more flexible interface that accepts both const char* and const string&.
  • It’s faster than std::string, in some cases.

Cons

  • It is not necessarily NUL-terminated, so passing its data() to functions such as printf is not safe.
  • The source data of the string_view must outlive the string_view itself.