Encode and Decode Strings medium
Problem Statement
Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.
Please note: The encoded string on the network is limited to a length of 10000.
Example 1:
Input: ["hello","world"] Output: ["hello","world"] Explanation: "hello/world" is not valid because the delimiter '/' could be part of the strings.
Example 2:
Input: ["we","say","yes"] Output: ["we","say","yes"]
Steps:
-
Encoding:
- We'll use a length-prefixed encoding scheme. Each string will be prepended with its length (as a 4-byte integer). This avoids the need for delimiters that might clash with characters within the strings themselves.
-
Decoding:
- We'll read the encoded string 4 bytes at a time to extract the length of each string.
- Then, we'll read that many bytes to get the string itself.
- We'll repeat this process until the end of the encoded string.
Explanation:
The key to solving this problem efficiently and robustly is to avoid relying on delimiters that might appear within the strings. Using a length-prefix guarantees that we can correctly parse the encoded string regardless of its contents. The 4-byte integer provides enough space to encode string lengths up to a substantial size, sufficient for most practical scenarios.
Code (C++)
#include <iostream>
#include <vector>
#include <string>
using namespace std;
class Codec {
public:
// Encodes a list of strings to a single string.
string encode(vector<string>& strs) {
string encodedStr = "";
for (const string& str : strs) {
int len = str.length();
// Convert length to 4-byte integer representation
encodedStr += string((char*)&len, sizeof(len));
encodedStr += str;
}
return encodedStr;
}
// Decodes a single string to a list of strings.
vector<string> decode(string s) {
vector<string> decodedStrs;
int i = 0;
while (i < s.length()) {
// Extract length (4 bytes)
int len;
memcpy(&len, s.c_str() + i, 4);
i += 4;
// Extract string
decodedStrs.push_back(s.substr(i, len));
i += len;
}
return decodedStrs;
}
};
int main() {
Codec codec;
vector<string> strs = {"hello", "world"};
string encoded = codec.encode(strs);
vector<string> decoded = codec.decode(encoded);
for (const string& str : decoded) {
cout << str << " ";
}
cout << endl; // Output: hello world
vector<string> strs2 = {"we", "say", "yes"};
encoded = codec.encode(strs2);
decoded = codec.decode(encoded);
for (const string& str : decoded) {
cout << str << " ";
}
cout << endl; // Output: we say yes
return 0;
}
Complexity Analysis:
-
Time Complexity:
- Encoding: O(N), where N is the total number of characters in all strings.
- Decoding: O(N)
-
Space Complexity:
- Encoding: O(N) (to store the encoded string)
- Decoding: O(N) (to store the decoded strings)
The space complexity is dominated by the size of the input and output strings. The algorithm itself uses a constant amount of extra space beyond that.