<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Ruby on David Hamann</title><link>https://davidhamann.de/tags/ruby/</link><description>Recent content in Ruby on David Hamann</description><generator>Hugo</generator><language>en</language><copyright>&amp;copy; David Hamann</copyright><lastBuildDate>Sat, 14 May 2022 00:00:00 +0000</lastBuildDate><atom:link href="https://davidhamann.de/tags/ruby/feed.xml" rel="self" type="application/rss+xml"/><item><title>Bypassing regular expression checks with a line feed</title><link>https://davidhamann.de/2022/05/14/bypassing-regular-expression-checks/</link><pubDate>Sat, 14 May 2022 00:00:00 +0000</pubDate><guid>https://davidhamann.de/2022/05/14/bypassing-regular-expression-checks/</guid><description>&lt;p&gt;Regular expressions are often used to check if a user input should be allowed for a specific action or lead to an error as it might be malicious.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s say we have the following regular expression that should guard the application from allowing any characters that could be used to execute code as part of a template injection:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;/^[0-9a-z]+$/
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;At first sight this looks OK: if the string contains only digits and/or the lowercase letters &lt;code&gt;a-z&lt;/code&gt;, it matches and we can continue. If it contains any other character and thus does not match the pattern, we can error out. Injecting something like &lt;code&gt;abc&amp;lt;%=7*7%&amp;gt;&lt;/code&gt; or any other template injection pattern won&amp;rsquo;t work. Or will it? It depends&amp;hellip;&lt;/p&gt;
&lt;h2 id="implementations-matter"&gt;Implementations matter&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s compare the behavior in two environments: Ruby and Python.&lt;/p&gt;
&lt;p&gt;Ruby:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-ruby" data-lang="ruby"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;my_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;abc&amp;lt;%= 7*7 %&amp;gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;my_input&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/^[0-9a-z]+$/&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Matches pattern, let&amp;#39;s continue...&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;else&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Does not match. Error out...&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Python:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;my_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;abc&amp;lt;%= 7*7 %&amp;gt;&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;^[0-9a-z]+$&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Matches pattern, let&lt;/span&gt;&lt;span class="se"&gt;\&amp;#39;&lt;/span&gt;&lt;span class="s1"&gt;s continue...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Does not match. Error out...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Running either sample will lead to the error case.&lt;/p&gt;
&lt;p&gt;However, once we introduce a multi-line string, the two examples behave differently. Let&amp;rsquo;s change &lt;code&gt;my_input&lt;/code&gt; to &lt;code&gt;abc\n&amp;lt;%= 7*7 %&amp;gt;&lt;/code&gt; in both snippets and run them again:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;$ ruby test.rb
Matches pattern, let&amp;#39;s continue...
&lt;/code&gt;&lt;/pre&gt;&lt;pre tabindex="0"&gt;&lt;code&gt;$ python3 test.py
Does not match. Error out...
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If an attacker is able to control &lt;code&gt;my_input&lt;/code&gt; in the above Ruby example, and this input is then actually used somewhere important (like in a template), it can lead to remote code execution and/or information disclosure.&lt;/p&gt;
&lt;h2 id="why-is-it-different"&gt;Why is it different?&lt;/h2&gt;
&lt;p&gt;In Ruby (and not only there), &lt;code&gt;^&lt;/code&gt; and &lt;code&gt;$&lt;/code&gt; match at the start and end of each line, not just at the start and end of the whole string. So if any (!) single line matches, the match as a whole succeeds. What we want in this case is to anchor the pattern to the beginning and end of the string, which is possible with &lt;code&gt;\A&lt;/code&gt; and &lt;code&gt;\z&lt;/code&gt;.&lt;/p&gt;
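&lt;p&gt;As a minimal sketch of the difference, the following Ruby snippet runs the same multi-line input through a line-anchored and a string-anchored pattern. The helper &lt;code&gt;allowed?&lt;/code&gt;, the constant names and the &lt;code&gt;{{7*7}}&lt;/code&gt; placeholder payload are assumptions made for illustration; they are not part of the example above.&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code class="language-ruby" data-lang="ruby"&gt;# Hypothetical names, for illustration only.
LINE_ANCHORED   = /^[0-9a-z]+$/    # ^ and $ match at the start and end of each line
STRING_ANCHORED = /\A[0-9a-z]+\z/  # \A and \z match only at the start and end of the string

# Returns true if the input passes the given check.
def allowed?(input, pattern)
  pattern.match?(input)
end

# Placeholder payload: a harmless first line followed by disallowed characters.
payload = "abc\n{{7*7}}"

puts allowed?(payload, LINE_ANCHORED)     # true: the line "abc" matches on its own
puts allowed?(payload, STRING_ANCHORED)   # false: the whole string has to match
puts allowed?("abc123", STRING_ANCHORED)  # true: plain input is still accepted
puts allowed?("abc\n", STRING_ANCHORED)   # false: \z rejects even a trailing line feed
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The uppercase &lt;code&gt;\Z&lt;/code&gt; would still accept a single trailing line feed, which is why the stricter &lt;code&gt;\z&lt;/code&gt; is used in this sketch.&lt;/p&gt;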
&lt;p&gt;In Python, on the other hand, we would need to enable this multi-line behavior explicitly with &lt;code&gt;re.MULTILINE&lt;/code&gt;. Taking the example from above, this (probably unwanted) behavior would look like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="k"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;^[0-9a-z]+$&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MULTILINE&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Matches...&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Note, though, that Python&amp;rsquo;s &lt;code&gt;re.match&lt;/code&gt; only ever matches at the beginning of the string, not at the beginning of each line, even with &lt;code&gt;re.MULTILINE&lt;/code&gt; set (see &lt;code&gt;re.search&lt;/code&gt; for scanning the full string).&lt;/p&gt;
&lt;h2 id="be-mindful-of-the-implementation"&gt;Be mindful of the implementation&lt;/h2&gt;
&lt;p&gt;When testing the security of a specific restriction, be mindful of which environment you are dealing with. Sometimes, it only takes a line feed (&lt;code&gt;\n&lt;/code&gt;) to bypass a check and cause serious trouble.&lt;/p&gt;</description></item></channel></rss>