Does NodeJS need a StringBuilder class?

Strings in C/C++ are no more than a contiguous array of characters. std::string in C++ just extends and facilitates the usage of that concept.

It is well known that in garbage-collected programming languages strings tend to be immutable; that is, once a string is created it cannot be modified, so string[10] = 'c' is an invalid expression and string[10] is read-only. That is fine as it is, but it comes with a downside: every write operation on a string has to produce a copy, which costs extra space and processing time.
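To make the immutability concrete in JavaScript, here is a small sketch (the function and variable names are only illustrative). In strict mode, which ES modules always use, an indexed write to a string throws a TypeError; in sloppy mode it is silently ignored. Either way the original string never changes.

```javascript
// Try to overwrite the first character of a string.
function tryToMutate( str )
{
  "use strict";
  try
  {
    str[ 0 ] = "c"; // Throws a TypeError in strict mode.
  }
  catch( err )
  {
    return err instanceof TypeError;
  }
  return false;
}

const original = "hello";
const threw    = tryToMutate( original );

// "Modifying" a string really means building a new one.
const modified = "c" + original.slice( 1 );

console.log( threw );    // true
console.log( original ); // "hello"
console.log( modified ); // "cello"
```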

That can have a significant impact on your application, which is why those languages usually provide some kind of mutable string, such as the StringBuilder class in Java and C#.

But what alternatives does NodeJS give us? A quick search suggests that NodeJS doesn’t give us any, at least not in the way those languages do.

Let’s explore our options in NodeJS with an example.

Suppose we need to build a service that returns the full Lorem ipsum text N times. To keep things simple, N is a positive integer.

Background:

You can get the sample code to run this test on GitHub.

await Lorem.as.text();

Yields a single, full Lorem ipsum text string.

await Lorem.as.buffer();

Yields the same full Lorem ipsum text but, as you can guess, as a Buffer.
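The Lorem helper lives in the sample code and is not shown here. A minimal stand-in, assuming it only needs to resolve the same text either as a string or as a Buffer, might look like this. The short LOREM_TEXT constant and the object layout are hypothetical, not the author's implementation.

```javascript
// Hypothetical stand-in for the Lorem helper used in this post.
// A single sentence replaces the full Lorem ipsum text to keep the sketch short.
const LOREM_TEXT = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";

const Lorem =
{
  as :
  {
    // Resolves to the text as an immutable string.
    text   : async () => LOREM_TEXT ,

    // Resolves to the same text as a Buffer of UTF-8 bytes.
    buffer : async () => Buffer.from( LOREM_TEXT , "utf8" ) ,
  } ,
};
```

With something like this in scope, the generate_lorem_* functions below can be tried locally.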

Version #1 - Simple concatenation:

export default async function generate_lorem_v1( lengths )
{
  const text = await Lorem.as.text();
  let   ret  = "";

  for( let i = 0; i < lengths; ++i )
  {
    // Append the text, plus a blank-line separator between repetitions.
    ret += text;
    if( i < lengths - 1 ) ret += "\n\n";
  }

  return ret;
}

Version #2 - Buffer concatenation via File System:

import * as fs from "node:fs/promises";

export default async function generate_lorem_v2( lengths )
{
  const text     = await Lorem.as.text();
  const tempFile = await fs.open( "./lorem-ipsum-text.temp" , "w+" );

  for( let i = 0; i < lengths; ++i )
  {
    await tempFile.write( text );
    if( i < lengths - 1 ) await tempFile.write( "\n\n" );
  }

  // The writes left the cursor at the end of the file, and FileHandle has no
  // seek() method, so read the whole content explicitly from position 0.
  const { size } = await tempFile.stat();
  const buffer   = Buffer.alloc( size );
  await tempFile.read( buffer , 0 , size , 0 );

  // Turn the buffer into an immutable string.
  const ret = buffer.toString( "utf8" );

  // Don't forget to close the file handle.
  await tempFile.close();

  return ret;
}
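
Version #3 - Buffer concatenation via Buffer class:

// SpacingBuffer and DisableToString are not defined in this snippet; they are
// assumed to come from the surrounding module in the sample code.
export default async function generate_lorem_v3_fast( lengths )
{
  const textBuffer   = await Lorem.as.buffer();
  // N copies of textBuffer, minus the length of one SpacingBuffer.
  const bufferLen    = textBuffer.length * lengths - SpacingBuffer.length;
  // Buffer.allocUnsafe() is a static factory, so it is called without `new`.
  const outputBuffer = Buffer.allocUnsafe( bufferLen );
  // fill() repeats textBuffer until the output buffer is full.
  outputBuffer.fill( textBuffer );
  return DisableToString ? outputBuffer : outputBuffer.toString();
}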

Another (maybe slower) alternative could be:

export async function generate_lorem_v3_slow( lengths )
{
  const textBuffer   = await Lorem.as.buffer();
  const bufferLen    = textBuffer.length * lengths - SpacingBuffer.length;
  // Allocate and fill in a single call; Buffer.alloc() is also called without `new`.
  const outputBuffer = Buffer.alloc( bufferLen , textBuffer , "utf8" );
  return DisableToString ? outputBuffer : outputBuffer.toString();
}
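
Both V3 variants rely on the fact that filling a Buffer with another Buffer repeats the source bytes until the target is full. The tiny "ab" pattern below is only an illustration of that behaviour:

```javascript
const pattern = Buffer.from( "ab" , "utf8" );

// fill() repeats the source bytes until the target buffer is full.
const filled = Buffer.allocUnsafe( 6 ).fill( pattern );

// Buffer.alloc( size , fill ) allocates and fills in a single call.
const allocated = Buffer.alloc( 6 , pattern );

// When the size is not a multiple of the pattern length,
// the last repetition is truncated.
const truncated = Buffer.alloc( 5 , pattern );

console.log( filled.toString() );    // "ababab"
console.log( allocated.toString() ); // "ababab"
console.log( truncated.toString() ); // "ababa"
```

The truncation is presumably what the subtraction of SpacingBuffer.length is for: if the source buffer ends with the separator (an assumption, since the helper is not shown), the shortened output drops the final trailing separator.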

Version #4 - String concatenation via Array.join():

export default async function generate_lorem_v4( lengths )
{
  const txt = await Lorem.as.text();
  const arr = [];

  // Collect N references to the same string, then join them once.
  for( let i = 0; i < lengths; ++i ) arr.push( txt );

  return arr.join( "\n\n" );
}

The results are the following:

| Service version            | Iteration #1     | Iteration #2     | Iteration #3     | Iteration #4      | Iteration #5        |
|----------------------------|------------------|------------------|------------------|-------------------|---------------------|
| V1 - Concatenation         | 0ms (27.14 MB)   | 0ms (27.73 MB)   | 0ms (27.64 MB)   | 0ms (29.57 MB)    | 20ms (40.26 MB)     |
| V2 - File system as buffer | 26ms (27.44 MB)  | 10ms (28.41 MB)  | 80ms (36.20 MB)  | 735ms (97.33 MB)  | 6745ms (708.92 MB)  |
| V3 - Direct Buffers        | 0ms (27.45 MB)   | 0ms (27.92 MB)   | 1ms (32.91 MB)   | 7ms (63.60 MB)    | 74ms (374.65 MB)    |
| V4 - Using Arrays          | 0ms (27.72 MB)   | 0ms (27.91 MB)   | 2ms (32.91 MB)   | 12ms (63.67 MB)   | 127ms (376.08 MB)   |

  • Iteration #1 → lengths = 10.
  • Iteration #2 → lengths = 100.
  • Iteration #3 → lengths = 1e3.
  • Iteration #4 → lengths = 1e4.
  • Iteration #5 → lengths = 1e5.
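
For readers who want to take similar measurements, the sketch below shows one possible way to time a generator and sample memory. It is a hypothetical harness, not the one that produced the numbers above; the measure() name and the choice of the rss metric from process.memoryUsage() are assumptions.

```javascript
// Hypothetical harness: times one async call and samples the process memory.
async function measure( label , fn )
{
  const start  = performance.now();
  const result = await fn();
  const ms     = performance.now() - start;

  // Resident set size in MB; heapUsed would be another reasonable metric.
  const mb     = process.memoryUsage().rss / 1024 / 1024;

  console.log( `${label}: ${ms.toFixed( 0 )}ms (${mb.toFixed( 2 )} MB)` );
  return { result , ms , mb };
}

// Usage with a small inline generator instead of the real services.
const joinDemo = async ( n ) => Array( n ).fill( "lorem" ).join( "\n\n" );
const pending  = measure( "join x 1e3" , () => joinDemo( 1e3 ) );
```

Single runs like this are noisy, so it is worth repeating each measurement several times before comparing versions.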

I didn’t expect V2 (file system as buffer) to use that amount of memory.

Conclusions:

The way NodeJS (V8) handles string concatenation is usually fast enough on a good machine, so that’s why they didn’t add any StringBuilder-like implementation.

Buffers are fast when you rely on their functionality and constantly work with the Stream class, but in day-to-day code I’d normally use simple concatenation (the fastest) or Array.join() (still good performance, and good-looking).
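
If a StringBuilder-style interface is still wanted for readability, a thin wrapper over Array.join() is enough. The class below is a hypothetical sketch, not a NodeJS API, and it makes no performance claim beyond what V4 already showed.

```javascript
// Hypothetical StringBuilder: collects parts in an array and joins them once.
class StringBuilder
{
  #parts = [];

  append( value )
  {
    this.#parts.push( String( value ) );
    return this; // Allow chaining.
  }

  toString( separator = "" )
  {
    return this.#parts.join( separator );
  }
}

const sb = new StringBuilder();
sb.append( "Lorem" ).append( " " ).append( "ipsum" );

console.log( sb.toString() ); // "Lorem ipsum"
```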

Footnotes:

I do not agree with TheCodingHorror Blog this time.

There are plenty of garbage-collected languages that implement some sort of StringBuilder class; it IS a common pattern across them. Saying, plain and simple, that It. Just. Doesn't. Matter! is basically saying that the creators and maintainers of those languages added a feature just for the sake of adding it. While I DO believe many languages suffer from feature bloat, and I prefer simpler languages and interfaces like Lua, this is clearly not one of those cases.