<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>gubins.lv</title>
    <subtitle>Personal website of Ilja Gubins</subtitle>
    <link rel="self" type="application/atom+xml" href="https://gubins.lv/atom.xml"/>
    <link rel="alternate" type="text/html" href="https://gubins.lv"/>
    <generator uri="https://www.getzola.org/">Zola</generator>
    <updated>2022-01-02T00:00:00+00:00</updated>
    <id>https://gubins.lv/atom.xml</id>
    <entry xml:lang="en">
        <title>Finding hyperrectangle vertex coordinates</title>
        <published>2022-01-02T00:00:00+00:00</published>
        <updated>2022-01-02T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://gubins.lv/posts/hyperrectangle/"/>
        <id>https://gubins.lv/posts/hyperrectangle/</id>
        
        <content type="html" xml:base="https://gubins.lv/posts/hyperrectangle/">&lt;p&gt;For the past two years I have been working on a Python library for tiling and merging of n-dimensional numpy arrays, &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;the-lay&#x2F;tiler&quot;&gt;tiler&lt;&#x2F;a&gt;. From the first release it had a method to return bounding boxes of resulting tiles, but it returned only the minimum (i.e., bottom-left) and maximum (i.e., top-right) corners. Recently I needed to find all vertices of such a bounding box, based on the two opposite vertices returned by &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;the-lay.github.io&#x2F;tiler&#x2F;#Tiler.get_tile_bbox&quot;&gt;&lt;code&gt;Tiler.get_tile_bbox()&lt;&#x2F;code&gt;&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;The solution that comes to mind first is to calculate the coordinates with multiple nested for loops. While this part is not very performance critical, I was still interested in trying to rewrite that in numpy to avoid for loops. I found an algorithm on &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;stackoverflow.com&#x2F;questions&#x2F;57064879&#x2F;finding-coordinate-of-rectangle-in-n-dimension&#x2F;57065356#57065356&quot;&gt;stackoverflow&lt;&#x2F;a&gt; that involved iterating through 2^n numbers and then iterating through each bit of that number in binary. Pretty cool approach, but implementing this straight in Python seemed a nuisance: converting an integer to binary produces a string (i.e., &lt;code&gt;bin(9) =&amp;gt; &#x27;0b1001&#x27;&lt;&#x2F;code&gt;), then iterating through that string and checking with an if-else whether it is 0 or 1 and then picking the correct coordinate…&lt;&#x2F;p&gt;
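&lt;p&gt;For comparison, here is a minimal plain-Python sketch of that counting idea. It is only my reading of the approach, not the code from the linked answer: the name &lt;code&gt;corners_by_counting&lt;&#x2F;code&gt; is made up for illustration, and the bits are extracted with integer arithmetic instead of going through the &lt;code&gt;bin()&lt;&#x2F;code&gt; string.&lt;&#x2F;p&gt;

```python
def corners_by_counting(mins, maxs):
    """Enumerate all 2^n corners of the box spanned by mins and maxs.

    Hypothetical helper: bit d of the counter i decides whether
    dimension d takes its minimum (0) or its maximum (1).
    """
    n_dim = len(mins)
    corners = []
    for i in range(2 ** n_dim):
        corner = []
        for d in range(n_dim):
            bit = (i // 2 ** d) % 2  # d-th bit of i, without bin() strings
            corner.append(maxs[d] if bit else mins[d])
        corners.append(tuple(corner))
    return corners


print(corners_by_counting([0, 0], [2, 3]))
# [(0, 0), (2, 0), (0, 3), (2, 3)]
```

&lt;p&gt;It works, but it is two nested Python loops, which is exactly what I wanted to get away from.&lt;&#x2F;p&gt;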
&lt;p&gt;I didn’t see any other implementation of this on the internet, so here I am, sharing my Sunday night solution, with just one list comprehension. Essentially, it creates an array of all 2^n combinations of n bits and then uses it to index into the array of intervals.&lt;&#x2F;p&gt;
&lt;p&gt;Probably it can be optimized even further 🤔&lt;&#x2F;p&gt;
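&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;itertools
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;numpy &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;as &lt;&#x2F;span&gt;&lt;span&gt;np
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;def &lt;&#x2F;span&gt;&lt;span style=&quot;color:#8fa1b3;&quot;&gt;get_all_corners&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;bottom_left_corner&lt;&#x2F;span&gt;&lt;span&gt;, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;top_right_corner&lt;&#x2F;span&gt;&lt;span&gt;):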
&lt;&#x2F;span&gt;&lt;span&gt;    n_dim = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;len&lt;&#x2F;span&gt;&lt;span&gt;(bottom_left_corner)
&lt;&#x2F;span&gt;&lt;span&gt;    mins = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;minimum&lt;&#x2F;span&gt;&lt;span&gt;(bottom_left_corner, top_right_corner)
&lt;&#x2F;span&gt;&lt;span&gt;    maxs = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;maximum&lt;&#x2F;span&gt;&lt;span&gt;(bottom_left_corner, top_right_corner)
&lt;&#x2F;span&gt;&lt;span&gt;    intervals = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;stack&lt;&#x2F;span&gt;&lt;span&gt;([mins, maxs], -&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span&gt;    indexing_bits = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;array&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;list&lt;&#x2F;span&gt;&lt;span&gt;(itertools.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;product&lt;&#x2F;span&gt;&lt;span&gt;([&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span&gt;, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span&gt;], &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;repeat&lt;&#x2F;span&gt;&lt;span&gt;=n_dim)))
&lt;&#x2F;span&gt;&lt;span&gt;    corners = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;stack&lt;&#x2F;span&gt;&lt;span&gt;([intervals[x][indexing_bits.T[x]] &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;x &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;range&lt;&#x2F;span&gt;&lt;span&gt;(n_dim)], -&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;return &lt;&#x2F;span&gt;&lt;span&gt;corners
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
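&lt;p&gt;One possible simplification, offered as a sketch rather than a benchmarked improvement: since every coordinate of a corner is either the minimum or the maximum along its axis, &lt;code&gt;np.where&lt;&#x2F;code&gt; can pick between the two directly and the list comprehension goes away. The name &lt;code&gt;get_all_corners_where&lt;&#x2F;code&gt; is hypothetical and not part of tiler.&lt;&#x2F;p&gt;

```python
import itertools

import numpy as np


def get_all_corners_where(corner_a, corner_b):
    """Return all 2^n corners of the box spanned by two opposite corners."""
    mins = np.minimum(corner_a, corner_b)
    maxs = np.maximum(corner_a, corner_b)
    # Every row is one combination of n bits: 0 selects the min, 1 selects the max
    bits = np.array(list(itertools.product([0, 1], repeat=len(mins))))
    # mins and maxs broadcast along the rows of bits
    return np.where(bits, maxs, mins)


print(get_all_corners_where([0, 0], [2, 3]).tolist())
# [[0, 0], [0, 3], [2, 0], [2, 3]]
```

&lt;p&gt;The result has shape &lt;code&gt;(2^n, n)&lt;&#x2F;code&gt;, with corners in the order produced by &lt;code&gt;itertools.product&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;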
&lt;p&gt;– Ilja&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Visualizing years of messaging on Facebook</title>
        <published>2021-04-26T00:00:00+00:00</published>
        <updated>2021-04-26T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://gubins.lv/posts/messenger/"/>
        <id>https://gubins.lv/posts/messenger/</id>
        
        <content type="html" xml:base="https://gubins.lv/posts/messenger/">&lt;p&gt;Back in winter 2016, I discovered &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.reddit.com&#x2F;r&#x2F;dataisbeautiful&#x2F;top&#x2F;?sort=top&amp;amp;t=all&quot;&gt;&#x2F;r&#x2F;DataIsBeautiful&lt;&#x2F;a&gt; (and &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.reddit.com&#x2F;r&#x2F;dataisugly&#x2F;top&#x2F;?t=all&quot;&gt;&#x2F;r&#x2F;DataIsUgly&lt;&#x2F;a&gt;) and when the holidays came, I set the goal of learning D3.js and creating a nice visualization myself. I saw &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.reddit.com&#x2F;r&#x2F;dataisbeautiful&#x2F;comments&#x2F;5pi9sn&#x2F;my_year_in_facebook_messages_created_with_d3js_oc&#x2F;dffjsr1&#x2F;&quot;&gt;a reddit post&lt;&#x2F;a&gt; and immediately thought that was something I wanted to try to replicate. It took me a few full days of learning, but &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;i.imgur.com&#x2F;yFJvj7g.jpg&quot;&gt;I was really happy with the result&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;More than 4 years have passed since then, and it would be interesting to see how my messaging patterns and habits have changed. Unfortunately, back then I was not as careful with my side projects as I am now, and I lost all the code. Moreover, I (happily) haven’t touched JavaScript for a few years now, and I am hoping to continue with that. For my 2021 spring vacation, I’ve set a goal of redoing the visualization in Python with new data, documenting everything properly for a blog post, and hopefully finding something new along the way. If you are reading this on my website, the experiment is a success!&lt;&#x2F;p&gt;
&lt;h2 id=&quot;downloading-facebook-data&quot;&gt;Downloading Facebook data&lt;&#x2F;h2&gt;
&lt;p&gt;First, we need to obtain the messaging data. A few years ago I had to use some third-party scripts that would scroll through my messages and save them one by one. Technology (or more precisely, enforcement of GDPR) moves forward, and now Facebook provides a nifty tool with which you can download the data directly. Go to &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.facebook.com&#x2F;dyi&quot;&gt;facebook.com&#x2F;dyi&lt;&#x2F;a&gt;, select Messages, JSON format and press Create File. Depending on the number of messages you have, it will take anywhere between half an hour and a day to receive the download link.&lt;&#x2F;p&gt;


	&lt;img src=&quot;https:&amp;#x2F;&amp;#x2F;gubins.lv&amp;#x2F;processed_images&amp;#x2F;facebook_dyi.228d0991ea2aa510.png&quot; class=&quot;is-center image-center&quot; &#x2F;&gt;
&lt;p&gt;Although it is not needed for this project, and the archive would probably take longer to create, I highly recommend requesting all of the available data, not just messages. It is quite eye-opening to see exactly how much information Facebook collects, and not just on their websites, but &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.eff.org&#x2F;deeplinks&#x2F;2020&#x2F;01&#x2F;how-change-your-facebook-activity-settings&quot;&gt;from third parties&lt;&#x2F;a&gt;, &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.cnet.com&#x2F;news&#x2F;heres-how-facebook-collects-your-data-when-youre-logged-out&#x2F;&quot;&gt;even when you are not logged in&lt;&#x2F;a&gt;.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;parsing-and-filtering-messages&quot;&gt;Parsing and filtering messages&lt;&#x2F;h2&gt;
&lt;p&gt;Messages are stored in &lt;code&gt;message_*.json&lt;&#x2F;code&gt; files in the &lt;code&gt;messages&lt;&#x2F;code&gt; folder of your Facebook data archive. Let’s read all of them (and &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;stackoverflow.com&#x2F;a&#x2F;50011987&#x2F;1668421&quot;&gt;fix Facebook’s mojibake&lt;&#x2F;a&gt;) and make a list of all messages that we can find:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;json
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;from &lt;&#x2F;span&gt;&lt;span&gt;pathlib &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;Path
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# fix mojibake: https:&#x2F;&#x2F;stackoverflow.com&#x2F;a&#x2F;62160255&#x2F;1668421
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;def &lt;&#x2F;span&gt;&lt;span style=&quot;color:#8fa1b3;&quot;&gt;parse_obj&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;obj&lt;&#x2F;span&gt;&lt;span&gt;):
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;isinstance&lt;&#x2F;span&gt;&lt;span&gt;(obj, str):
&lt;&#x2F;span&gt;&lt;span&gt;        &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;return &lt;&#x2F;span&gt;&lt;span&gt;obj.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;encode&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;latin_1&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;).&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;decode&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;utf-8&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;isinstance&lt;&#x2F;span&gt;&lt;span&gt;(obj, list):
&lt;&#x2F;span&gt;&lt;span&gt;        &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;return &lt;&#x2F;span&gt;&lt;span&gt;[&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;parse_obj&lt;&#x2F;span&gt;&lt;span&gt;(o) &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;o &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;obj]
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;isinstance&lt;&#x2F;span&gt;&lt;span&gt;(obj, dict):
&lt;&#x2F;span&gt;&lt;span&gt;        &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;return &lt;&#x2F;span&gt;&lt;span&gt;{key: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;parse_obj&lt;&#x2F;span&gt;&lt;span&gt;(item) &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;key, item &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;obj.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;items&lt;&#x2F;span&gt;&lt;span&gt;()}
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;return &lt;&#x2F;span&gt;&lt;span&gt;obj
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;message_files = [f &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;f &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;Path&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;path&#x2F;to&#x2F;messages&#x2F;folder&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;).&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;rglob&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;message_*.json&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)]
&lt;&#x2F;span&gt;&lt;span&gt;messages = []
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;message_files:
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;with &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;open&lt;&#x2F;span&gt;&lt;span&gt;(m, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;r&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;) &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;as &lt;&#x2F;span&gt;&lt;span&gt;f:
&lt;&#x2F;span&gt;&lt;span&gt;        messages += &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;parse_obj&lt;&#x2F;span&gt;&lt;span&gt;(json.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;load&lt;&#x2F;span&gt;&lt;span&gt;(f)[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;messages&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;])
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
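&lt;p&gt;To see why the &lt;code&gt;latin_1&lt;&#x2F;code&gt; round trip in &lt;code&gt;parse_obj&lt;&#x2F;code&gt; helps, here is a tiny illustration. It assumes the export mangles text by exposing its UTF-8 bytes as Latin-1 characters, which is my understanding of the linked answer; the sample string is made up.&lt;&#x2F;p&gt;

```python
# A made-up sample string with a non-ASCII character
original = 'Rīga'

# Simulate the assumed damage: UTF-8 bytes misread as Latin-1 characters
mojibake = original.encode('utf-8').decode('latin_1')

# The same round trip that parse_obj applies to every string
repaired = mojibake.encode('latin_1').decode('utf-8')

print(len(original), len(mojibake))  # 4 5
print(repaired == original)  # True
```

&lt;p&gt;The damaged string is one character longer because the two-byte letter turned into two separate characters.&lt;&#x2F;p&gt;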
&lt;p&gt;Each element of &lt;code&gt;messages&lt;&#x2F;code&gt; is a dictionary and depending on the type, fields can differ. In my archive I have found five types:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span&gt;&amp;gt;&amp;gt;&amp;gt; &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;set&lt;&#x2F;span&gt;&lt;span&gt;([m[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;] &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages])
&lt;&#x2F;span&gt;&lt;span&gt;[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Share&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Subscribe&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Unsubscribe&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Generic&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Call&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;]
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Of these five, I am interested only in &lt;code&gt;Share&lt;&#x2F;code&gt; (sending links) and &lt;code&gt;Generic&lt;&#x2F;code&gt; (chat messages). &lt;code&gt;Subscribe&lt;&#x2F;code&gt; and &lt;code&gt;Unsubscribe&lt;&#x2F;code&gt; seem to be used only for Facebook bots, while &lt;code&gt;Call&lt;&#x2F;code&gt; is self-explanatory.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span&gt;messages = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;list&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;filter&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;lambda &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;d&lt;&#x2F;span&gt;&lt;span&gt;: d[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;] in [&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Share&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Generic&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;], messages))
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Let’s inspect those messages and see how many unique fields I have in my data and whether they are always populated. I have to note that this is what I have in my data. It is probably not the full list of all possible fields and YMMV.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span&gt;fields = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;list&lt;&#x2F;span&gt;&lt;span&gt;({field &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;message &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;field &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;message})
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# There is definitely a one-line list comprehension that can replace this snippet
&lt;&#x2F;span&gt;&lt;span&gt;counter = dict.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;fromkeys&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;list&lt;&#x2F;span&gt;&lt;span&gt;({field &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;message &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;field &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;message}), &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;f &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;counter:
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages:
&lt;&#x2F;span&gt;&lt;span&gt;        &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span&gt;f in m:
&lt;&#x2F;span&gt;&lt;span&gt;            counter[f] += &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;&amp;gt;&amp;gt; {f: counter[f] == &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;len&lt;&#x2F;span&gt;&lt;span&gt;(messages) &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;f &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;counter}
&lt;&#x2F;span&gt;&lt;span&gt;{
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;files&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;content&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;timestamp_ms&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;True&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;videos&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;gifs&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;photos&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;reactions&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;audio_files&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;True&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;is_unsent&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;True&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;ip&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;sticker&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;sender_name&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;True&lt;&#x2F;span&gt;&lt;span&gt;, 
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;share&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;: &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;False
&lt;&#x2F;span&gt;&lt;span&gt;}
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;It seems that the fields &lt;code&gt;timestamp_ms&lt;&#x2F;code&gt;, &lt;code&gt;is_unsent&lt;&#x2F;code&gt;, &lt;code&gt;sender_name&lt;&#x2F;code&gt; and &lt;code&gt;type&lt;&#x2F;code&gt; are the only ones that are populated in all messages.&lt;&#x2F;p&gt;
&lt;p&gt;Now, let’s filter by time range; for example, let’s filter out messages that are not from 2020. Note that Facebook stores timestamps in milliseconds.&lt;&#x2F;p&gt;
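&lt;p&gt;The per-field counting loop can probably be shortened with &lt;code&gt;collections.Counter&lt;&#x2F;code&gt;. This is a sketch on a made-up two-message sample, not on my real archive; the variable name &lt;code&gt;always_populated&lt;&#x2F;code&gt; is only for illustration.&lt;&#x2F;p&gt;

```python
from collections import Counter

# Hypothetical stand-in for the parsed list of message dictionaries
messages = [
    {'sender_name': 'Alice', 'timestamp_ms': 1, 'type': 'Generic',
     'is_unsent': False, 'content': 'hi'},
    {'sender_name': 'Bob', 'timestamp_ms': 2, 'type': 'Share',
     'is_unsent': False, 'share': {'link': 'https://example.com'}},
]

# Count in how many messages each field name appears
counter = Counter(field for message in messages for field in message)

# A field is always populated if it appears in every message
always_populated = {field: count == len(messages) for field, count in counter.items()}

print(always_populated['type'], always_populated['content'])  # True False
```

&lt;p&gt;Iterating over a dictionary yields its keys, so each message contributes every field name exactly once.&lt;&#x2F;p&gt;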
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;time
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;time_from = time.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mktime&lt;&#x2F;span&gt;&lt;span&gt;(time.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;strptime&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;01&#x2F;01&#x2F;2020&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;%d&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;%m&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;%Y&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)) * &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1000
&lt;&#x2F;span&gt;&lt;span&gt;time_to = time.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mktime&lt;&#x2F;span&gt;&lt;span&gt;(time.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;strptime&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;01&#x2F;01&#x2F;2021&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;, &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;%d&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;%m&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;%Y&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)) * &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1000
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;messages = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;list&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;filter&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;lambda &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;d&lt;&#x2F;span&gt;&lt;span&gt;: (d[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;timestamp_ms&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;] &amp;gt; time_from) and (d[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;timestamp_ms&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;] &amp;lt; time_to), messages))
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
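&lt;p&gt;Note that &lt;code&gt;time.mktime&lt;&#x2F;code&gt; interprets the date in the local timezone of the machine running the script. If the year boundaries should be in UTC instead, a variant with &lt;code&gt;datetime&lt;&#x2F;code&gt; could look like the sketch below. The helper &lt;code&gt;sent_in_year&lt;&#x2F;code&gt; and the sample records are made up for illustration.&lt;&#x2F;p&gt;

```python
from datetime import datetime, timezone


def sent_in_year(message, year):
    """Return True if the message timestamp falls in the given calendar year (UTC)."""
    # timestamp_ms is in milliseconds, fromtimestamp expects seconds
    sent_at = datetime.fromtimestamp(message['timestamp_ms'] / 1000, tz=timezone.utc)
    return sent_at.year == year


# Hypothetical sample records; only the timestamp_ms field matters here
sample = [
    {'timestamp_ms': 1577836800000},  # 2020-01-01 00:00:00 UTC
    {'timestamp_ms': 1609459200000},  # 2021-01-01 00:00:00 UTC
]

in_2020 = [m for m in sample if sent_in_year(m, 2020)]
print(len(in_2020))  # 1
```

&lt;p&gt;Comparing the year directly also avoids having to decide whether a message sent exactly at midnight belongs to the range.&lt;&#x2F;p&gt;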
&lt;h2 id=&quot;messaging-heatmap&quot;&gt;Messaging heatmap&lt;&#x2F;h2&gt;
&lt;p&gt;Now that we have only the messages we need for the visualization, let’s recreate the graph that I mentioned before with &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;pycairo.readthedocs.io&#x2F;en&#x2F;latest&#x2F;&quot;&gt;Pycairo&lt;&#x2F;a&gt;, the Python bindings for the cairo graphics library. The API is pretty similar to D3.js and quite pythonic (I think I will use Pycairo again in the future!). The toughest part was the basic trigonometry for finding the correct coordinates on the circle and tweaking sizes and offsets to make it look better.&lt;&#x2F;p&gt;
&lt;p&gt;In short, the drawing algorithm is as follows:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Draw background circle&lt;&#x2F;li&gt;
&lt;li&gt;Draw inner circles, month delimiters and month names&lt;&#x2F;li&gt;
&lt;li&gt;Draw inner text&lt;&#x2F;li&gt;
&lt;li&gt;For each message:
&lt;ol&gt;
&lt;li&gt;Convert message’s timestamp to the day of the year and a fraction of the day&lt;&#x2F;li&gt;
&lt;li&gt;Draw a small semi-transparent stroke on the circle in the appropriate day of the year and time of the day&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
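&lt;p&gt;The per-message step can be sketched without any drawing code. This is not the code behind the image; it only illustrates the trigonometry under my own assumptions: the angle encodes the day of the year (starting at 12 o’clock, going clockwise), the radius encodes the time of the day, timestamps are treated as UTC, and the function name, canvas center and radii are all made up.&lt;&#x2F;p&gt;

```python
import calendar
import math
from datetime import datetime, timezone


def message_position(timestamp_ms, center=(500.0, 500.0), r_inner=200.0, r_outer=450.0):
    """Map a timestamp to (x, y): angle = day of the year, radius = time of the day."""
    dt = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    days_in_year = 366 if calendar.isleap(dt.year) else 365
    # Fractions in [0, 1): position within the year and within the day
    year_fraction = (dt.timetuple().tm_yday - 1) / days_in_year
    day_fraction = (dt.hour * 3600 + dt.minute * 60 + dt.second) / 86400
    # Shift by a quarter turn so that January 1st sits at 12 o'clock;
    # y grows downwards on the canvas, so increasing angles go clockwise
    angle = 2 * math.pi * year_fraction - math.pi / 2
    radius = r_inner + (r_outer - r_inner) * day_fraction
    x = center[0] + radius * math.cos(angle)
    y = center[1] + radius * math.sin(angle)
    return x, y


# Midnight on 2020-01-01 (UTC) lands on the inner circle, straight above the center
x, y = message_position(1577836800000)
print(round(x, 6), round(y, 6))  # 500.0 300.0
```

&lt;p&gt;Each message would then become a short semi-transparent stroke at the returned position, so busy hours and days add up to darker areas.&lt;&#x2F;p&gt;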


&lt;a href=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;messenger&#x2F;2020_messenger_heatmap.png&quot; target=&quot;_blank&quot;&gt;

	&lt;img src=&quot;https:&amp;#x2F;&amp;#x2F;gubins.lv&amp;#x2F;processed_images&amp;#x2F;2020_messenger_heatmap.016ce269d38437d1.png&quot; class=&quot;is-center image-center&quot; &#x2F;&gt;

&lt;&#x2F;a&gt;
&lt;p&gt;2020 was a very unusual year for all of us. I was lucky enough to (happily) get stuck in Japan with my girlfriend, essentially maxing out my tourist visa and staying there for almost 6 months. This is clearly visible on the heatmap: I left for Japan in mid-March and returned home in mid-August.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;analyzing-messages-data&quot;&gt;Analyzing messages data&lt;&#x2F;h2&gt;
&lt;p&gt;While my initial goal was to recreate the heatmap, I have decided to try to find some other information in the messages data.&lt;&#x2F;p&gt;
&lt;p&gt;Unique people you have chatted with:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;len&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;set&lt;&#x2F;span&gt;&lt;span&gt;([m[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;sender_name&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;] &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages]))
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;How many unsent messages you have in your chats:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;len&lt;&#x2F;span&gt;&lt;span&gt;([m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span&gt;m[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;is_unsent&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;]])
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;How many messages you and your friends have reacted to:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;len&lt;&#x2F;span&gt;&lt;span&gt;([m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;reactions&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39; in m])
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
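&lt;p&gt;The snippet above counts messages that carry at least one reaction. To count the individual reactions instead, something like the sketch below might work. It assumes that &lt;code&gt;reactions&lt;&#x2F;code&gt; is a list of dictionaries with an &lt;code&gt;actor&lt;&#x2F;code&gt; key; that layout and the sample data are assumptions for illustration, so check them against your own export.&lt;&#x2F;p&gt;

```python
from collections import Counter

# Hypothetical sample; the 'reactions' list and its 'reaction' and 'actor'
# keys are assumptions about the export format.
messages = [
    {'sender_name': 'Alice', 'reactions': [{'reaction': '+1', 'actor': 'Bob'}]},
    {'sender_name': 'Bob'},
    {'sender_name': 'Alice', 'reactions': [
        {'reaction': '+1', 'actor': 'Bob'},
        {'reaction': 'haha', 'actor': 'Carol'},
    ]},
]

# total number of individual reactions across all messages
total_reactions = sum(len(m.get('reactions', [])) for m in messages)

# how many reactions each person has left
reactions_by_actor = Counter(
    r['actor'] for m in messages for r in m.get('reactions', [])
)
```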
&lt;h3 id=&quot;message-content&quot;&gt;Message content&lt;&#x2F;h3&gt;
&lt;p&gt;Since the content of each message is also included, you can go even further and analyze the messages themselves. For example, we can count occurrences of every word:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;from &lt;&#x2F;span&gt;&lt;span&gt;collections &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;Counter
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;string
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;c = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;Counter&lt;&#x2F;span&gt;&lt;span&gt;()
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages:
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;content&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39; in m:
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;        &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# remove punctuation
&lt;&#x2F;span&gt;&lt;span&gt;        content = m[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;content&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;].&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;translate&lt;&#x2F;span&gt;&lt;span&gt;(str.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;maketrans&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&amp;#39;, &amp;#39;&amp;#39;, string.punctuation)) 
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;        &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# count occurrences in lowercase
&lt;&#x2F;span&gt;&lt;span&gt;        &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;w &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;content.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;split&lt;&#x2F;span&gt;&lt;span&gt;():
&lt;&#x2F;span&gt;&lt;span&gt;            c[w.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;lower&lt;&#x2F;span&gt;&lt;span&gt;()] += &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1  
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Later, we can make a bar plot of the 25 most common words:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;matplotlib.pyplot &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;as &lt;&#x2F;span&gt;&lt;span&gt;plt 
&lt;&#x2F;span&gt;&lt;span&gt;mc = c.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;most_common&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;25&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span&gt;plt.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;bar&lt;&#x2F;span&gt;&lt;span&gt;([x[&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span&gt;] &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;x &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;mc], [x[&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span&gt;] &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;x &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;mc])
&lt;&#x2F;span&gt;&lt;span&gt;plt.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;show&lt;&#x2F;span&gt;&lt;span&gt;()
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The resulting distribution doesn’t strictly follow &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Zipf%27s_law&quot;&gt;Zipf’s law&lt;&#x2F;a&gt;, which states that the frequency of any word is inversely proportional to its rank, but it is reminiscent of a log graph and heavy-tailed, as expected.&lt;&#x2F;p&gt;


	&lt;img src=&quot;https:&amp;#x2F;&amp;#x2F;gubins.lv&amp;#x2F;processed_images&amp;#x2F;occurences.2f9ae13e21f80cd7.png&quot; class=&quot;is-center image-center&quot; &#x2F;&gt;
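&lt;p&gt;To make the comparison with Zipf’s law a bit more concrete, one could put the observed counts next to the counts an idealised Zipf distribution would predict. The sketch below is only an illustration: &lt;code&gt;zipf_table&lt;&#x2F;code&gt; is a hypothetical helper, the toy counter stands in for the real &lt;code&gt;c&lt;&#x2F;code&gt;, and the prediction is simply the top count divided by the rank.&lt;&#x2F;p&gt;

```python
from collections import Counter


def zipf_table(counter, n=25):
    """Pair each of the n most common words with its Zipf prediction.

    Under an idealised Zipf's law the word of rank r appears top_count / r
    times, where top_count is the count of the most common word.
    """
    most_common = counter.most_common(n)
    if not most_common:
        return []
    top_count = most_common[0][1]
    return [
        (word, count, top_count / rank)
        for rank, (word, count) in enumerate(most_common, start=1)
    ]


# hypothetical toy counts, not taken from real chat data
c = Counter({'the': 120, 'a': 60, 'to': 40, 'cat': 10})
for word, observed, expected in zipf_table(c):
    print(f'{word:5} observed={observed:4d} zipf={expected:7.1f}')
```

&lt;p&gt;Large gaps between the two columns at higher ranks would show where the chat vocabulary departs from the law.&lt;&#x2F;p&gt;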
&lt;h3 id=&quot;location-data&quot;&gt;Location data&lt;&#x2F;h3&gt;
&lt;p&gt;Another interesting discovery is that Facebook seems to save the IP addresses from which I logged in to Messenger. On &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.facebook.com&#x2F;help&#x2F;930396167085762#table2&quot;&gt;their DYI info page&lt;&#x2F;a&gt;, they call it “Your recent message activity from specific IP addresses”.&lt;&#x2F;p&gt;
&lt;p&gt;The unique IPs Facebook has seen you message from:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;set&lt;&#x2F;span&gt;&lt;span&gt;([m[&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;ip&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;] &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;for &lt;&#x2F;span&gt;&lt;span&gt;m &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;in &lt;&#x2F;span&gt;&lt;span&gt;messages &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;ip&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39; in m])
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Going further, you can find the approximate location of each IP address with tools like &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;ipinfo.io&quot;&gt;ipinfo.io&lt;&#x2F;a&gt;, which recently launched a bulk upload tool. Plotting those locations on a world map, you get something like this:&lt;&#x2F;p&gt;


	&lt;img src=&quot;https:&amp;#x2F;&amp;#x2F;gubins.lv&amp;#x2F;processed_images&amp;#x2F;logins.5d38ec1643c8f64a.png&quot; class=&quot;is-center image-center&quot; &#x2F;&gt;
&lt;p&gt;– Ilja&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Jeff Tupper’s quine</title>
        <published>2021-04-05T00:00:00+00:00</published>
        <updated>2021-04-05T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://gubins.lv/posts/tupper/"/>
        <id>https://gubins.lv/posts/tupper/</id>
        
        <content type="html" xml:base="https://gubins.lv/posts/tupper/">&lt;p&gt;Yesterday, a friend sent me Numberphile’s video, &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.youtube.com&#x2F;watch?v=_s5RFgd59ao&quot;&gt;“The ‘Everything’ Formula”&lt;&#x2F;a&gt;
&lt;div class=&quot;is-center youtube&quot;&gt;
    &lt;iframe
        src=&quot;https:&#x2F;&#x2F;www.youtube.com&#x2F;embed&#x2F;_s5RFgd59ao&quot;
        webkitallowfullscreen
        mozallowfullscreen
        allowfullscreen&gt;
    &lt;&#x2F;iframe&gt;
&lt;&#x2F;div&gt;&lt;&#x2F;p&gt;
&lt;p&gt;The video describes a neat formula that acts as a plotter for any arbitrary bitmap of size 106x17. For some reason, it is often referred to as “Tupper’s self-referential formula”:&lt;&#x2F;p&gt;
&lt;p&gt;$$\frac{1}{2} &amp;lt; \left\lfloor \mathrm{mod}\left(\left\lfloor \frac{y}{17} \right\rfloor 2^{-17 \lfloor x \rfloor - \mathrm{mod}\left(\lfloor y\rfloor, 17\right)},2\right)\right\rfloor$$&lt;&#x2F;p&gt;
&lt;p&gt;Jeff Tupper’s &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.dgp.toronto.edu&#x2F;papers&#x2F;jtupper_SIGGRAPH2001.pdf&quot;&gt;original paper&lt;&#x2F;a&gt; indeed contains this formula (Figure 13), but it is listed only as an example of two-dimensional graphing and it is never described as self-referential.
While it is a cool head-scratcher, it is definitely not self-referential, but rather a &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Quine_(computing)&quot;&gt;quine&lt;&#x2F;a&gt;. It can reproduce itself in a different modality, but only when you look at the right place on the infinite x-y plane. I like &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;web.archive.org&#x2F;web&#x2F;20150424181239&#x2F;http:&#x2F;&#x2F;arvindn.livejournal.com&#x2F;132943.html&quot;&gt;Arvind Narayanan’s comparison&lt;&#x2F;a&gt; with &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Infinite_monkey_theorem&quot;&gt;the infinite monkey theorem&lt;&#x2F;a&gt;. Can a monkey hitting keys at random for an infinite amount of time eventually type Shakespeare’s Hamlet? Yes, almost surely. The question is where we can find it in the monkey’s infinite string. Can a function that draws every possible 106x17 bitmap (1802 pixels!) somewhere on the x-y plane contain some representation of the original function? It sure can, but we need to know exactly where.&lt;&#x2F;p&gt;
&lt;p&gt;There are dozens of articles and blog posts that go into detail on &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;shreevatsa.wordpress.com&#x2F;2011&#x2F;04&#x2F;12&#x2F;how-does-tuppers-self-referential-formula-work&#x2F;&quot;&gt;how it works&lt;&#x2F;a&gt; and some even propose an &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;jtra.cz&#x2F;stuff&#x2F;essays&#x2F;math-self-reference&#x2F;index.html&quot;&gt;actual self-referential formula&lt;&#x2F;a&gt;. The posts I have linked explain the theory behind the formula much better than I can. Instead, on this quiet Easter Monday I want to demonstrate how I implemented both finding where the bitmap we are looking for is located and plotting that bitmap, all in modern Python.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;plotting-the-formula&quot;&gt;Plotting the formula&lt;&#x2F;h2&gt;
&lt;p&gt;Let’s start with plotting. The image that we are looking for lies in &lt;code&gt;0 &amp;lt;= x &amp;lt; 106&lt;&#x2F;code&gt; and &lt;code&gt;k &amp;lt;= y &amp;lt; k + 17&lt;&#x2F;code&gt;. In the original paper, &lt;code&gt;k&lt;&#x2F;code&gt; is defined as&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;960 939 379 918 958 884 971 672 962 127 852 754 715 004 339 660 129 306 651 505 519 271 702 802 395 266 424 689 642 842 174 350 718 121 267 153 782 770 623 355 993 237 280 874 144 307 891 325 963 941 337 723 487 857 735 749 823 926 629 715 517 173 716 995 165 232 890 538 221 612 403 238 855 866 184 013 235 585 136 048 828 693 337 902 491 454 229 288 667 081 096 184 496 091 705 183 454 067 827 731 551 705 405 381 627 380 967 602 565 625 016 981 482 083 418 783 163 849 115 590 225 610 003 652 351 370 343 874 461 848 378 737 238 198 224 849 863 465 033 159 410 054 974 700 593 138 339 226 497 249 461 751 545 728 366 702 369 745 461 014 655 997 933 798 537 483 143 786 841 806 593 422 227 898 388 722 980 000 748 404 719&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;A straightforward way to plot this is to evaluate the function for each (x, y) point in this region separately. Instead, we can vectorize the computation with NumPy and use Pillow for loading and saving images:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;numpy &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;as &lt;&#x2F;span&gt;&lt;span&gt;np
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;from &lt;&#x2F;span&gt;&lt;span&gt;PIL &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;Image
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;K = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;... &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# omitted for readability
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# build a meshgrid for X coordinates from 0 to 106 and Y coordinates from K to K+17
&lt;&#x2F;span&gt;&lt;span&gt;x, y = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;meshgrid&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;range&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;106&lt;&#x2F;span&gt;&lt;span&gt;), &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;range&lt;&#x2F;span&gt;&lt;span&gt;(K, K+&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;17&lt;&#x2F;span&gt;&lt;span&gt;))
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# evaluate Tupper&amp;#39;s formula for all points at once
&lt;&#x2F;span&gt;&lt;span&gt;result = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0.5 &lt;&#x2F;span&gt;&lt;span&gt;&amp;lt; ((y &#x2F;&#x2F; &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;17&lt;&#x2F;span&gt;&lt;span&gt;) &#x2F;&#x2F; (&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2 &lt;&#x2F;span&gt;&lt;span&gt;** (&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;17 &lt;&#x2F;span&gt;&lt;span&gt;* x + y % &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;17&lt;&#x2F;span&gt;&lt;span&gt;))) % &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# in my implementation we need to mirror the image due to coordinates mismatch
&lt;&#x2F;span&gt;&lt;span&gt;result = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;fliplr&lt;&#x2F;span&gt;&lt;span&gt;(result)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# convert into an image and save to disk
&lt;&#x2F;span&gt;&lt;span&gt;Image.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;fromarray&lt;&#x2F;span&gt;&lt;span&gt;(result).&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;save&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;original.png&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;Here’s the “self-replicating” result: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;original.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;p&gt;We have confirmed that the approach works. Now let’s see how we can find more interesting bitmaps!&lt;&#x2F;p&gt;
&lt;h2 id=&quot;finding-k-for-arbitrary-images&quot;&gt;Finding k for arbitrary images&lt;&#x2F;h2&gt;
&lt;p&gt;The algorithm is quite simple:&lt;&#x2F;p&gt;
&lt;ol&gt;
&lt;li&gt;Convert the 106x17 image of interest into a thresholded binary image where each pixel has value 0 or 1.&lt;&#x2F;li&gt;
&lt;li&gt;Flatten the image column-wise into a single binary number that consists of 106*17=1802 ones or zeros.&lt;&#x2F;li&gt;
&lt;li&gt;Convert this number into base-10 and multiply by 17.&lt;&#x2F;li&gt;
&lt;&#x2F;ol&gt;
&lt;p&gt;Let’s translate this into code. Again, we’ll use NumPy and PIL.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;numpy &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;as &lt;&#x2F;span&gt;&lt;span&gt;np
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;from &lt;&#x2F;span&gt;&lt;span&gt;PIL &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;Image
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# load the image and convert it to a numpy array
&lt;&#x2F;span&gt;&lt;span&gt;image = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;array&lt;&#x2F;span&gt;&lt;span&gt;(Image.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;open&lt;&#x2F;span&gt;&lt;span&gt;(path_to_image))
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# threshold the image: convert the array to binary and then back to int
&lt;&#x2F;span&gt;&lt;span&gt;image = image.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;astype&lt;&#x2F;span&gt;&lt;span&gt;(bool).&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;astype&lt;&#x2F;span&gt;&lt;span&gt;(int)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# transpose (image must be encoded column-wise), mirror (to match original coordinates frame) and flatten
&lt;&#x2F;span&gt;&lt;span&gt;image = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;fliplr&lt;&#x2F;span&gt;&lt;span&gt;(image.T).&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;flatten&lt;&#x2F;span&gt;&lt;span&gt;()
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# represent array as one integer in base 10 and multiply it by 17
&lt;&#x2F;span&gt;&lt;span&gt;K = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;int&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&amp;#39;.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;join&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;map&lt;&#x2F;span&gt;&lt;span&gt;(str, image)), &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;&#x2F;span&gt;&lt;span&gt;) * &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;17
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# for readability let&amp;#39;s split the found number into groups of 3 digits
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;from &lt;&#x2F;span&gt;&lt;span&gt;textwrap &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;wrap
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;print&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;f&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Found K: &lt;&#x2F;span&gt;&lt;span&gt;{&amp;quot; &amp;quot;.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;join&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;wrap&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;str&lt;&#x2F;span&gt;&lt;span&gt;(K), &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;&#x2F;span&gt;&lt;span&gt;))}&amp;#39;)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
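&lt;p&gt;As a cross-check of the encoding idea, here is a small pure-Python round trip that needs neither NumPy nor Pillow. It is a sketch under stated assumptions: &lt;code&gt;encode&lt;&#x2F;code&gt; and &lt;code&gt;decode&lt;&#x2F;code&gt; are hypothetical helpers, and the bitmap is indexed in the formula’s own coordinates as &lt;code&gt;columns[x][r]&lt;&#x2F;code&gt;, so the transposing and mirroring needed for image arrays is left out.&lt;&#x2F;p&gt;

```python
HEIGHT = 17  # the formula is tied to bitmaps that are 17 pixels tall


def encode(columns):
    """Pack columns[x][r] (0 or 1, r = 0..16) into k, a multiple of 17."""
    n = 0
    for x, column in enumerate(columns):
        for r, bit in enumerate(column):
            # pixel (x, r) becomes binary digit number 17 * x + r
            n += bit * 2 ** (HEIGHT * x + r)
    return n * HEIGHT


def decode(k, width):
    """Evaluate the formula at integer points x = 0..width-1, y = k..k+16."""
    columns = []
    for x in range(width):
        column = []
        for y in range(k, k + HEIGHT):
            # integer form of floor(mod(floor(y / 17) * 2 ** -(17 x + y mod 17), 2))
            column.append(((y // HEIGHT) // 2 ** (HEIGHT * x + y % HEIGHT)) % 2)
        columns.append(column)
    return columns


# hypothetical 2-column bitmap with one pixel set in each column
bitmap = [[1] + [0] * 16, [0, 1] + [0] * 15]
k = encode(bitmap)
restored = decode(k, width=2)
```

&lt;p&gt;Multiplying by 17 at the end is what makes &lt;code&gt;y &#x2F;&#x2F; 17&lt;&#x2F;code&gt; return the same packed number for all 17 rows, while &lt;code&gt;y % 17&lt;&#x2F;code&gt; selects the row within the column.&lt;&#x2F;p&gt;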
&lt;p&gt;Let’s see if this will work for the OG: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;original.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;Found K: 960 939 379 918 958 884 971 672 962 127 852 754 715 004 339 660 129 306 651 505 519 271 702 802 395 266 424 689 642 842 174 350 718 121 267 153 782 770 623 355 993 237 280 874 144 307 891 325 963 941 337 723 487 857 735 749 823 926 629 715 517 173 716 995 165 232 890 538 221 612 403 238 855 866 184 013 235 585 136 048 828 693 337 902 491 454 229 288 667 081 096 184 496 091 705 183 454 067 827 731 551 705 405 381 627 380 967 602 565 625 016 981 482 083 418 783 163 849 115 590 225 610 003 652 351 370 343 874 461 848 378 737 238 198 224 849 863 465 033 159 410 054 974 700 593 138 339 226 497 249 461 751 545 728 366 702 369 745 461 014 655 997 933 798 537 483 143 786 841 806 593 422 227 898 388 722 980 000 748 404 719&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;That’s exactly the same number as above, yay! Time to find some interesting k’s.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;other-k&quot;&gt;Other k&lt;&#x2F;h2&gt;
&lt;p&gt;As a tribute to the author, let’s find k of Jeff Tupper’s name: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;jeff_tupper.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;Found K: 385 815 185 034 138 158 985 870 922 709 187 771 741 984 863 903 001 560 903 755 944 700 527 424 584 051 315 029 117 989 906 439 493 180 632 249 939 860 962 369 068 130 333 226 454 308 011 056 898 770 189 860 026 183 100 209 457 279 325 986 175 729 482 270 868 521 087 312 425 053 904 627 001 175 969 513 613 540 074 140 325 118 540 773 528 302 948 665 290 791 255 891 024 244 612 978 717 349 953 081 144 003 722 839 141 078 743 527 892 814 519 951 422 463 077 169 240 686 548 347 390 436 099 422 493 685 020 088 219 659 235 193 486 350 953 092 824 772 778 233 932 879 750 089 176 444 321 163 669 351 665 248 816 840 127 309 058 081 171 960 635 932 943 715 857 693 920 045 698 334 277 341 544 448&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;Mustering all my artistic skill, I managed to draw something easily recognizable even on such a small image: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;smiley.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;Found K: 126 142 104 213 220 484 299 803 993 442 062 205 409 077 505 220 633 202 015 858 972 307 182 501 421 645 782 823 217 849 945 196 989 219 566 869 642 586 856 002 258 241 479 301 667 237 018 424 945 404 030 790 228 025 510 765 380 030 265 144 252 981 546 142 589 226 719 086 965 652 739 520 845 129 864 851 583 864 577 815 519 135 290 434 195 548 911 229 646 938 854 682 829 338 643 067 177 226 847 816 952 758 734 166 379 981 459 848 756 683 277 095 647 000 957 329 360 499 307 384 746 219 787 041 087 624 524 570 154 355 080 970 812 831 032 393 680 420 031 478 481 534 582 784&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;We can go further and add support for RGB(A) images by grayscaling and thresholding before converting the image to a number, allowing us to find &lt;code&gt;k&lt;&#x2F;code&gt; of some fun things.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span&gt;image.ndim == &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;&#x2F;span&gt;&lt;span&gt;:
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# thresholding value was chosen experimentally ¯\_(ツ)_&#x2F;¯
&lt;&#x2F;span&gt;&lt;span&gt;    image = image[&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;...&lt;&#x2F;span&gt;&lt;span&gt;, :&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;&#x2F;span&gt;&lt;span&gt;].&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mean&lt;&#x2F;span&gt;&lt;span&gt;(-&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span&gt;) &amp;gt; (image.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mean&lt;&#x2F;span&gt;&lt;span&gt;() &#x2F; &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1.5&lt;&#x2F;span&gt;&lt;span&gt;)  
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;For example, we can try to find k of &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;knowyourmeme.com&#x2F;memes&#x2F;woman-yelling-at-a-cat&quot;&gt;a cat who is being yelled at by a woman&lt;&#x2F;a&gt;: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;cat.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;Found K: 851 030 412 628 336 556 165 809 369 962 109 912 804 161 242 368 430 933 212 288 014 494 536 807 542 740 734 381 671 364 065 155 879 388 013 869 494 956 962 061 134 763 310 263 193 385 575 072 149 137 089 604 705 201 819 322 847 554 746 397 401 554 468 442 840 234 615 665 452 154 695 967 266 451 351 629 936 505 011 951 742 325 309 194 431 147 721 355 712 298 208 733 534 626 570 288 417 823 332 710 025 407 637 465 753 296 616 837 452 115 490 622 693 556 930 112 201 054 318 590 446 660 494 224 496 302 101 506 150 764 129 717 619 346 698 940 226 775 371 940 997 279 460 630 181 000 578 484 533 391 638 594 731 484 579 785 972 285 338 008 247 656 925 936 746 496&lt;&#x2F;p&gt;
&lt;p&gt;Result: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;tupper_cat.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;Or a surprised pikachu: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;pikachu.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;blockquote&gt;
&lt;p&gt;Found K: 100 471 937 400 773 042 105 488 795 487 079 558 956 078 370 881 471 858 986 932 192 035 983 663 503 559 690 104 085 241 627 816 344 060 029 535 358 169 730 565 614 348 280 274 258 391 956 093 706 487 773 600 207 321 552 633 572 512 727 519 083 578 975 536 089 240 998 241 402 260 568 596 955 034 368 758 047 736 965 380 237 422 807 289 192 867 659 494 212 761 961 844 085 991 026 277 158 315 733 803 429 051 940 595 481 874 380 651 187 292 245 665 209 511 827 991 598 028 036 167 858 029 884 225 988 105 814 239 390 403 337 695 473 230 329 464 921 830 026 338 051 548 823 510 917 702 632 449 767 686 964 811 819 836 113 753 470 019 547 288 874 741 950 634 085 738 923 194 014 636 840 792 896 8&lt;&#x2F;p&gt;
&lt;p&gt;Result: &lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;tupper&#x2F;tupper_pikachu.png&quot; alt=&quot;&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;&#x2F;blockquote&gt;
&lt;p&gt;All in all, that was a fun one-day project!&lt;&#x2F;p&gt;
&lt;p&gt;– Ilja&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Developing a custom thumbnailer for Nautilus</title>
        <published>2021-03-22T00:00:00+00:00</published>
        <updated>2021-03-22T00:00:00+00:00</updated>
        
        <author>
          <name>
            
              Unknown
            
          </name>
        </author>
        
        <link rel="alternate" type="text/html" href="https://gubins.lv/posts/thumbnailer/"/>
        <id>https://gubins.lv/posts/thumbnailer/</id>
        
        <content type="html" xml:base="https://gubins.lv/posts/thumbnailer/">&lt;p&gt;Linux file managers, such as GNOME’s Nautilus (used by default in Pop OS and Ubuntu), display thumbnails for files instead of generic icons: for example, a “screenshot” of the first page of a PDF document or the first frame of a GIF. A thumbnailer is a program that, given an input file, generates such a representation. I work a lot with a 3D imaging format called MRC2014, as published by &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;http:&#x2F;&#x2F;www.sciencedirect.com&#x2F;science&#x2F;article&#x2F;pii&#x2F;S104784771500074X&quot;&gt;Cheng et al. (2015)&lt;&#x2F;a&gt;. It is a common format for cryo-electron microscopy and tomography, maintained by &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.ccpem.ac.uk&#x2F;mrc_format&#x2F;mrc_format.php&quot;&gt;CCP-EM&lt;&#x2F;a&gt; on behalf of the EM community. Depending on the binning (downsampling) level of the data, a file can be anywhere from a couple of megabytes to many gigabytes. While scientific packages such as &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;fiji.sc&#x2F;&quot;&gt;ImageJ&#x2F;Fiji&lt;&#x2F;a&gt; or &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.cgl.ucsf.edu&#x2F;chimera&#x2F;&quot;&gt;UCSF Chimera&lt;&#x2F;a&gt; can open the files, it is still a custom format without thumbnails. Descriptive file names help, but as the proverb goes, a picture is worth a thousand words. So, one weekend evening I decided to learn how file managers generate thumbnails and to develop a thumbnailer for MRC files.&lt;&#x2F;p&gt;
&lt;p&gt;There is a ton of information about thumbnailers online, but to my surprise, it is scattered and often outdated. For example, an official GNOME document that explains how to add thumbnailers and comes up at the top of search results is from 2006 and has not been applicable since 2011. I decided to write down the steps I had to take for my thumbnailer in the hope that this will be useful for somebody some day.&lt;&#x2F;p&gt;
&lt;p&gt;&lt;img src=&quot;https:&#x2F;&#x2F;gubins.lv&#x2F;posts&#x2F;thumbnailer&#x2F;mrc-gnome-thumbnailer.png&quot; alt=&quot;Before and after installing MRC thumbnailer&quot; &#x2F;&gt;&lt;&#x2F;p&gt;
&lt;h2 id=&quot;add-mime-type&quot;&gt;Add MIME type&lt;&#x2F;h2&gt;
&lt;p&gt;First, we need to register the filename extension (i.e., &lt;code&gt;.mrc&lt;&#x2F;code&gt;) in the system MIME type database: we create an XML file that describes the MIME type following the &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;freedesktop.org&#x2F;wiki&#x2F;Specifications&#x2F;shared-mime-info-spec&#x2F;&quot;&gt;shared-mime-info schema&lt;&#x2F;a&gt;, add this file alongside the rest and update the database. It is not necessary to fill in all the elements; in my case it is enough to specify the MIME type, name, generic icon (shown before the thumbnail is generated or if generation has failed) and filename extension matcher. The type should be unique on the system, and there can be multiple extension matchers, e.g. lowercase and uppercase.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;xml&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-xml &quot;&gt;&lt;code class=&quot;language-xml&quot; data-lang=&quot;xml&quot;&gt;&lt;span&gt;&amp;lt;?&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;xml &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;version&lt;&#x2F;span&gt;&lt;span&gt;=&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;1.0&lt;&#x2F;span&gt;&lt;span&gt;&amp;quot; &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;encoding&lt;&#x2F;span&gt;&lt;span&gt;=&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;UTF-8&lt;&#x2F;span&gt;&lt;span&gt;&amp;quot;?&amp;gt;
&lt;&#x2F;span&gt;&lt;span&gt;&amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mime-info &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;xmlns&lt;&#x2F;span&gt;&lt;span&gt;=&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;http:&#x2F;&#x2F;www.freedesktop.org&#x2F;standards&#x2F;shared-mime-info&lt;&#x2F;span&gt;&lt;span&gt;&amp;quot;&amp;gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mime-type &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;=&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;image&#x2F;mrc&lt;&#x2F;span&gt;&lt;span&gt;&amp;quot;&amp;gt;
&lt;&#x2F;span&gt;&lt;span&gt;        &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;comment&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;MRC2014&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;comment&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;
&lt;&#x2F;span&gt;&lt;span&gt;        &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;generic-icon &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;name&lt;&#x2F;span&gt;&lt;span&gt;=&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;image-x-generic&lt;&#x2F;span&gt;&lt;span&gt;&amp;quot;&#x2F;&amp;gt;
&lt;&#x2F;span&gt;&lt;span&gt;        &amp;lt;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;glob &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;pattern&lt;&#x2F;span&gt;&lt;span&gt;=&amp;quot;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;*.mrc&lt;&#x2F;span&gt;&lt;span&gt;&amp;quot;&#x2F;&amp;gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mime-type&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;
&lt;&#x2F;span&gt;&lt;span&gt;&amp;lt;&#x2F;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mime-info&lt;&#x2F;span&gt;&lt;span&gt;&amp;gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
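&lt;p&gt;Copy this file to &lt;code&gt;&#x2F;usr&#x2F;share&#x2F;mime&#x2F;packages&#x2F;&lt;&#x2F;code&gt; and run &lt;code&gt;sudo update-mime-database &#x2F;usr&#x2F;share&#x2F;mime&lt;&#x2F;code&gt;. Note that the command takes the MIME directory that contains &lt;code&gt;packages&#x2F;&lt;&#x2F;code&gt;, not the &lt;code&gt;packages&#x2F;&lt;&#x2F;code&gt; subdirectory itself.&lt;&#x2F;p&gt;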
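&lt;p&gt;If you want several extension matchers, the same XML can also be generated instead of written by hand. Below is a minimal sketch that uses only the Python standard library. The helper name &lt;code&gt;build_mime_info&lt;&#x2F;code&gt; and the uppercase &lt;code&gt;*.MRC&lt;&#x2F;code&gt; pattern are made up for illustration, and the output is printed on one line without the XML declaration.&lt;&#x2F;p&gt;

```python
import xml.etree.ElementTree as ET

# Namespace used by shared-mime-info package files
NS = "http://www.freedesktop.org/standards/shared-mime-info"


def build_mime_info(mime_type, comment, icon, patterns):
    """Build a mime-info element with one glob child per filename pattern."""
    # Serialize with a default namespace instead of an ns0: prefix
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}mime-info")
    entry = ET.SubElement(root, f"{{{NS}}}mime-type", {"type": mime_type})
    ET.SubElement(entry, f"{{{NS}}}comment").text = comment
    ET.SubElement(entry, f"{{{NS}}}generic-icon", {"name": icon})
    for pattern in patterns:
        ET.SubElement(entry, f"{{{NS}}}glob", {"pattern": pattern})
    return root


# Hypothetical example: match both lowercase and uppercase extensions
root = build_mime_info("image/mrc", "MRC2014", "image-x-generic", ["*.mrc", "*.MRC"])
print(ET.tostring(root, encoding="unicode"))
```

&lt;p&gt;Redirect the output to a file (the file name is up to you) and copy it to &lt;code&gt;&#x2F;usr&#x2F;share&#x2F;mime&#x2F;packages&#x2F;&lt;&#x2F;code&gt; as described above.&lt;&#x2F;p&gt;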
&lt;h2 id=&quot;thumbnailing-script&quot;&gt;Thumbnailing script&lt;&#x2F;h2&gt;
&lt;p&gt;Now, we need to write a program that would actually generate a thumbnail image. The program must accept three arguments: path to the input file, path to the output file and thumbnail image size (the thumbnail is square).&lt;&#x2F;p&gt;
&lt;p&gt;For my use case, I decided to write a Python script that loads the central slice of a tomogram, rescales the intensities to 0-255 to show it as an image, downsizes it to the thumbnail size and saves it to the given output path. The MRC format itself is &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;www.ccpem.ac.uk&#x2F;mrc_format&#x2F;mrc2014.php&quot;&gt;pretty complex&lt;&#x2F;a&gt;, with an extensive header that stores various metadata. Luckily, CCP-EM maintains a very convenient Python library that makes loading the files a breeze, &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;ccpem&#x2F;mrcfile&quot;&gt;mrcfile&lt;&#x2F;a&gt;. Together with NumPy and Pillow, the script looks something like this:&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;python&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-python &quot;&gt;&lt;code class=&quot;language-python&quot; data-lang=&quot;python&quot;&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;#!&#x2F;usr&#x2F;bin&#x2F;python3
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# -*- coding: utf-8 -*-
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;numpy &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;as &lt;&#x2F;span&gt;&lt;span&gt;np
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;sys
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;mrcfile
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;argparse
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;from &lt;&#x2F;span&gt;&lt;span&gt;PIL &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;Image
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;import &lt;&#x2F;span&gt;&lt;span&gt;warnings
&lt;&#x2F;span&gt;&lt;span&gt;warnings.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;simplefilter&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;ignore&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# Intensity rescaling helpers
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# Shamelessly adapted from scikit-image
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;def &lt;&#x2F;span&gt;&lt;span style=&quot;color:#8fa1b3;&quot;&gt;rescale&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;image&lt;&#x2F;span&gt;&lt;span&gt;, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;in_range&lt;&#x2F;span&gt;&lt;span&gt;) -&amp;gt; np.ndarray:
&lt;&#x2F;span&gt;&lt;span&gt;    imin, imax = in_range
&lt;&#x2F;span&gt;&lt;span&gt;    image = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;clip&lt;&#x2F;span&gt;&lt;span&gt;(image, imin, imax)
&lt;&#x2F;span&gt;&lt;span&gt;    image = (image - imin) &#x2F; (imax - imin)
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;return &lt;&#x2F;span&gt;&lt;span&gt;np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;asarray&lt;&#x2F;span&gt;&lt;span&gt;(image * &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;255.&lt;&#x2F;span&gt;&lt;span&gt;, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;dtype&lt;&#x2F;span&gt;&lt;span&gt;=np.uint8)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span&gt;__name__ == &amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;__main__&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;:
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# parse arguments
&lt;&#x2F;span&gt;&lt;span&gt;    parser = argparse.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;ArgumentParser&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;description&lt;&#x2F;span&gt;&lt;span&gt;=&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Generate thumbnails for .mrc images&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)
&lt;&#x2F;span&gt;&lt;span&gt;    parser.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;add_argument&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;input&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;,    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;=str,   &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;help&lt;&#x2F;span&gt;&lt;span&gt;=&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Path to input mrc file&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)
&lt;&#x2F;span&gt;&lt;span&gt;    parser.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;add_argument&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;output&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;,   &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;=str,   &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;help&lt;&#x2F;span&gt;&lt;span&gt;=&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Path to output png thumbnail&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)
&lt;&#x2F;span&gt;&lt;span&gt;    parser.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;add_argument&lt;&#x2F;span&gt;&lt;span&gt;(&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;size&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;,     &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;type&lt;&#x2F;span&gt;&lt;span&gt;=int,   &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;help&lt;&#x2F;span&gt;&lt;span&gt;=&amp;#39;&lt;&#x2F;span&gt;&lt;span style=&quot;color:#a3be8c;&quot;&gt;Thumbnail image size&lt;&#x2F;span&gt;&lt;span&gt;&amp;#39;)
&lt;&#x2F;span&gt;&lt;span&gt;    arguments = parser.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;parse_args&lt;&#x2F;span&gt;&lt;span&gt;()
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# mmap the input file
&lt;&#x2F;span&gt;&lt;span&gt;    f = mrcfile.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;mmap&lt;&#x2F;span&gt;&lt;span&gt;(arguments.input, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;permissive&lt;&#x2F;span&gt;&lt;span&gt;=&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;True&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# extract central slice
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;if &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;len&lt;&#x2F;span&gt;&lt;span&gt;(f.data.shape) == &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;&#x2F;span&gt;&lt;span&gt;:
&lt;&#x2F;span&gt;&lt;span&gt;        data = f.data
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;elif &lt;&#x2F;span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;len&lt;&#x2F;span&gt;&lt;span&gt;(f.data.shape) == &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;3&lt;&#x2F;span&gt;&lt;span&gt;:
&lt;&#x2F;span&gt;&lt;span&gt;        data = f.data[f.data.shape[&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span&gt;] &#x2F;&#x2F; &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;&#x2F;span&gt;&lt;span&gt;]
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#b48ead;&quot;&gt;else&lt;&#x2F;span&gt;&lt;span&gt;:
&lt;&#x2F;span&gt;&lt;span&gt;        f.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;close&lt;&#x2F;span&gt;&lt;span&gt;()
&lt;&#x2F;span&gt;&lt;span&gt;        sys.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;exit&lt;&#x2F;span&gt;&lt;span&gt;(-&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    f.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;close&lt;&#x2F;span&gt;&lt;span&gt;()
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# contrast stretching to 2-98 percentile
&lt;&#x2F;span&gt;&lt;span&gt;    p2, p98 = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;percentile&lt;&#x2F;span&gt;&lt;span&gt;(data, (&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;&#x2F;span&gt;&lt;span&gt;, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;98&lt;&#x2F;span&gt;&lt;span&gt;))
&lt;&#x2F;span&gt;&lt;span&gt;    data = &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;rescale&lt;&#x2F;span&gt;&lt;span&gt;(data, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;in_range&lt;&#x2F;span&gt;&lt;span&gt;=(p2, p98))
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# rotate 180 deg and flip to be consistent with imagej&#x2F;fiji
&lt;&#x2F;span&gt;&lt;span&gt;    data = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;rot90&lt;&#x2F;span&gt;&lt;span&gt;(data, &lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span&gt;    data = np.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;fliplr&lt;&#x2F;span&gt;&lt;span&gt;(data)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# convert to PIL
&lt;&#x2F;span&gt;&lt;span&gt;    img = Image.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;fromarray&lt;&#x2F;span&gt;&lt;span&gt;(data)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# resize to the thumbnail size
&lt;&#x2F;span&gt;&lt;span&gt;    img.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;thumbnail&lt;&#x2F;span&gt;&lt;span&gt;((arguments.size, arguments.size))
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    &lt;&#x2F;span&gt;&lt;span style=&quot;color:#65737e;&quot;&gt;# save image
&lt;&#x2F;span&gt;&lt;span&gt;    img.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;save&lt;&#x2F;span&gt;&lt;span&gt;(arguments.output)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;span&gt;    sys.&lt;&#x2F;span&gt;&lt;span style=&quot;color:#bf616a;&quot;&gt;exit&lt;&#x2F;span&gt;&lt;span&gt;(&lt;&#x2F;span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;&#x2F;span&gt;&lt;span&gt;)
&lt;&#x2F;span&gt;&lt;span&gt;
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The script must be executable (&lt;code&gt;sudo chmod a+rx&lt;&#x2F;code&gt;) and be on the PATH, e.g. in &lt;code&gt;&#x2F;usr&#x2F;bin&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;thumbnailer&quot;&gt;Thumbnailer&lt;&#x2F;h2&gt;
&lt;p&gt;&lt;code&gt;.thumbnailer&lt;&#x2F;code&gt; files link together MIME types and thumbnailer programs. The file must be copied to &lt;code&gt;&#x2F;usr&#x2F;share&#x2F;thumbnailers&#x2F;&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;pre data-lang=&quot;txt&quot; style=&quot;background-color:#2b303b;color:#c0c5ce;&quot; class=&quot;language-txt &quot;&gt;&lt;code class=&quot;language-txt&quot; data-lang=&quot;txt&quot;&gt;&lt;span&gt;[Thumbnailer Entry]
&lt;&#x2F;span&gt;&lt;span&gt;TryExec=&#x2F;usr&#x2F;bin&#x2F;mrc-thumbnailer
&lt;&#x2F;span&gt;&lt;span&gt;Exec=&#x2F;usr&#x2F;bin&#x2F;mrc-thumbnailer %i %o %s
&lt;&#x2F;span&gt;&lt;span&gt;MimeType=image&#x2F;mrc
&lt;&#x2F;span&gt;&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
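&lt;p&gt;In the &lt;code&gt;Exec&lt;&#x2F;code&gt; line, &lt;code&gt;%i&lt;&#x2F;code&gt;, &lt;code&gt;%o&lt;&#x2F;code&gt; and &lt;code&gt;%s&lt;&#x2F;code&gt; stand for the input path, the output path and the thumbnail size. To see what the script will be called with, the entry can be expanded by hand. This is a rough sketch that assumes a simple token-by-token substitution; the file manager’s real expansion may differ in details such as quoting. The function &lt;code&gt;thumbnailer_argv&lt;&#x2F;code&gt; and the sample paths are hypothetical.&lt;&#x2F;p&gt;

```python
import configparser
import shlex

# The .thumbnailer entry from above, inlined to keep the sketch self-contained
ENTRY = """\
[Thumbnailer Entry]
TryExec=/usr/bin/mrc-thumbnailer
Exec=/usr/bin/mrc-thumbnailer %i %o %s
MimeType=image/mrc
"""


def thumbnailer_argv(entry_text, input_path, output_path, size):
    """Return the argument list obtained by filling in the Exec placeholders."""
    # interpolation=None keeps the literal % placeholders intact
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve key case (Exec, not exec)
    parser.read_string(entry_text)
    exec_line = parser["Thumbnailer Entry"]["Exec"]
    values = {"%i": input_path, "%o": output_path, "%s": str(size)}
    # Replace placeholder tokens, keep everything else as-is
    return [values.get(token, token) for token in shlex.split(exec_line)]


# Hypothetical paths, for illustration only
argv = thumbnailer_argv(ENTRY, "/tmp/tomogram.mrc", "/tmp/thumb.png", 256)
print(argv)
# ['/usr/bin/mrc-thumbnailer', '/tmp/tomogram.mrc', '/tmp/thumb.png', '256']
```

&lt;p&gt;The resulting list can be passed to &lt;code&gt;subprocess.run&lt;&#x2F;code&gt; to try the thumbnailer on a real file outside of Nautilus.&lt;&#x2F;p&gt;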
&lt;h2 id=&quot;putting-it-all-together&quot;&gt;Putting it all together&lt;&#x2F;h2&gt;
&lt;p&gt;An important thing to keep in mind is that Nautilus has a file size limit for thumbnailing: anything above a specific threshold will not be passed to the thumbnailing script. You can adjust it in Nautilus under Preferences -&amp;gt; Search &amp;amp; Preview -&amp;gt; Thumbnails.&lt;&#x2F;p&gt;
&lt;p&gt;Now that everything is in place, quit all Nautilus processes and delete cached thumbnails: &lt;code&gt;nautilus -q&lt;&#x2F;code&gt; and &lt;code&gt;rm -r ~&#x2F;.cache&#x2F;thumbnails&lt;&#x2F;code&gt;. Open any folder containing the files you are interested in and the thumbnails should be there!&lt;&#x2F;p&gt;
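&lt;p&gt;Instead of wiping the whole cache, it is also possible to look up where the thumbnail for one particular file should end up. Under the freedesktop thumbnail specification, the cache file name is the MD5 hash of the file’s URI plus &lt;code&gt;.png&lt;&#x2F;code&gt;. This is a minimal sketch that assumes the default cache location &lt;code&gt;~&#x2F;.cache&#x2F;thumbnails&lt;&#x2F;code&gt; and that pathlib’s URI encoding matches the file manager’s, which may not hold for unusual file names. The function &lt;code&gt;thumbnail_cache_path&lt;&#x2F;code&gt; and the example path are hypothetical.&lt;&#x2F;p&gt;

```python
import hashlib
from pathlib import Path


def thumbnail_cache_path(file_path, flavor="normal"):
    """Guess the cached thumbnail location for a file.

    flavor is the cache subdirectory, e.g. "normal" or "large".
    """
    # The cache key is the MD5 hex digest of the absolute file URI
    uri = Path(file_path).absolute().as_uri()
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    # Assumption: default cache directory, XDG_CACHE_HOME is not overridden
    return Path.home() / ".cache" / "thumbnails" / flavor / (digest + ".png")


# Hypothetical example file
candidate = thumbnail_cache_path("/tmp/tomogram.mrc")
print(candidate, "exists" if candidate.exists() else "missing")
```

&lt;p&gt;Deleting just that one file and reopening the folder should make the file manager ask the thumbnailer for a fresh image.&lt;&#x2F;p&gt;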
&lt;p&gt;Developing a custom thumbnailer turned out to be quite an easy, one-evening project and is a neat quality of &lt;del&gt;life&lt;&#x2F;del&gt; work improvement. If you intend to distribute the thumbnailer, I suggest using a Makefile to automate the installation. You can find my complete MRC thumbnailer on GitHub: &lt;a rel=&quot;noopener nofollow noreferrer&quot; target=&quot;_blank&quot; href=&quot;https:&#x2F;&#x2F;github.com&#x2F;the-lay&#x2F;mrc-em-gnome-thumbnailer&quot;&gt;the-lay&#x2F;mrc-em-gnome-thumbnailer&lt;&#x2F;a&gt;. &lt;em&gt;December 2021 Update:&lt;&#x2F;em&gt; I have added .em file handling too and renamed the project to mrc-em-gnome-thumbnailer.&lt;&#x2F;p&gt;
&lt;p&gt;– Ilja&lt;&#x2F;p&gt;
</content>
        
    </entry>
</feed>
