Problem
I have written a small method that extracts certain information from a string. An example of such a string is
"Some text BLA/123/5345/349230498 some more text PNR: 12345678, Name: John, CallName: Peter, TimeStamp: 01.10.2015"
Now I need certain information from this string, e.g. the PNR
, Name
, CallName
, and TimeStamp
(as string
). Right now the method looks like this:
/// <summary>
/// Reads the value of a specified attribute from the log entry.
/// </summary>
/// <param name="identifier">The prefix used in the string (e.g. "Name" in "Name: John").</param>
/// <returns>The value of the attribute (e.g. "John" bei "Name: John".</returns>
private string GetValueFromMessage(string identifier)
{
int index = this.Message.IndexOf(identifier) + identifier.Length + 2;
if (index != -1)
{
char c = this.Message[index];
string result = string.Empty;
while (c != ',')
{
result += c;
index++;
if (index < this.Message.Length)
{
c = this.Message[index];
}
else
{
break;
}
}
return result;
}
return null;
}
Some parts that I don’t like about my code, and don’t know if they’re done correctly:
- It looks too long/not elegant to me. Specifically, I think the loop could be done more efficiently. I tried different loops, also tried with a line of the form
c = this.Message[++index]
, but then I ran into problems at the end of the string. - I want to get rid the double usage of
c = this.Message[index]
somehow. - Is it even good style to
return null
if there is no occurrence of the attribute? (As info: this can happen, but maybe it would be smarter to returnstring.Empty
then?) - Should I use
this.Message
instead of justMessage
when it is a public property of the surrounding class? - I also want to get rid of the
if
statement (and the uglybreak
) inside the loop if possible, but didn’t really find a way yet.
Maybe something along the lines of
while (index < this.Message.Length && c != ',')
would work?
Solution
I think that you should include the space before the identifier and the separator after the identifier, that way it will find the right identifier even if the string contains properties in a different order.
If the string contains for example CallName: Peter, NameAlias: Johnny, Name: John
you would want the last property when looking for Name
rather than the first property because it ends with Name
or the value "ias: Johnny"
from the second property.
(Including the space before the identifier naturally won’t find an identifier if it starts at the first character of the string, but that doesn’t seem to be the case with your data.)
You don’t need a loop to find the end of the value, you can use IndexOf
with a start index.
private string GetValueFromMessage(string identifier) {
identifier = " " + identifier + ": ";
int index = this.Message.IndexOf(identifier) + identifier.Length;
if (index != -1) {
int index2 = this.Message.IndexOf(",", index);
if (index2 == -1) {
index2 = this.Message.Length;
}
return this.Message.Substring(index, index2 - index);
}
return null;
}
Additionally, you can use .IndexOf(identifier, 0, StringComparison.OrdinalIgnoreCase)
if you want to make a case insensetive match on the identifier.
Using this.Message
or Message
is a matter of convention, neither is clearly better than the other. While this.Message
is more specific, Message
should usually be clear enough, but you have to be a bit more careful when naming members to avoid conflicts, but on the other hand you should always avoid those conflicts anyway.
If you know your delimiters, you can use them to break up the input string into smaller strings.
string input = @"Some text BLA/123/5345/349230498 some more text PNR: 12345678, Name: John, CallName: Peter, TimeStamp: 01.10.2015";
string value = String.Empty;
List<string> keyValuePairs = input.Split(',').ToList();
foreach (var keyValuePair in keyValuePairs)
{
string key = keyValuePair.Split(':')[0].Trim();
if (key == "Name")
{
value = keyValuePair.Split(':')[1];
}
}
You could go more functional in style, working off the splitting as shown by Dan to get something like this. The resulting dictionary could also be factored out and reused to extract other values. (Note: You may want to use TryGetValue instead of [“xxxxx”] depending on your data).
string input = @"Some text BLA/123/5345/349230498 some more text PNR: 12345678, Name: John, CallName: Peter, TimeStamp: 01.10.2015";
var value =
input
.Split(',')
.Select(
pair => pair.Split(':'))
.ToDictionary(
keyValue => keyValue[0].Trim(),
keyValue => keyValue[1].Trim())
["Name"];
Perhaps use a regex like this?
internal class Program
{
private const string MyInputString = @"Some text BLA/123/5345/349230498 some more text PNR: 12345678, Name: John, CallName: Peter, TimeStamp: 01.10.2015";
private static void Main()
{
try
{
var match = Regex.Match(MyInputString, @"PNR: (?<pnr>.*), Name: (?<name>.*), CallName: (?<callname>.*), TimeStamp: (?<timestamp>.*$)");
Console.WriteLine("PNR={0}, Name={1}, Callname={2}, Timestamp={3}",
match.Groups["pnr"],
match.Groups["name"],
match.Groups["callname"],
match.Groups["timestamp"]);
}
catch (Exception e)
{
//Add appropriate error handling here
Console.WriteLine(e.Message);
throw;
}
Console.ReadLine();
}
}
This will output:
PNR=12345678, Name=John, Callname=Peter, Timestamp=01.10.2015