Double Negative

Software, code and things.

Android fragments and memory

This post was initially entitled "Android fragments - the important stuff". It will hopefully provide an overview of the less obvious things associated with utilizing fragments in Android.

What is the difference between attach/add and detach/remove?

Detaching a fragment essentially dissociates a fragment from its activity. The FragmentTransaction documentation states that "the fragment is removed from the UI". This is in my opinion confusing - I understand it to mean that the fragment's UI is removed from the on-screen UI. The view is destroyed and is not held in memory, so as to minimize the application's memory footprint.

The documentation also states "This is the same state as when it is put on the back stack" - that is to say the Fragment instance is stored in memory along with its instance variables etc.

Adding/Attaching a fragment is the opposite.

What am I adding to the backstack?

When you programmatically add a fragment into a container, you do so using a FragmentTransaction. A fragment transaction is just a set of instructions for what you want to happen - remove fragment A, add fragment B for example.

When you add a transaction to the backstack, you do exactly that.. you add the transaction. The backstack contains FragmentTransactions not Fragments.

When you pop the backstack you essentially say 'do the opposite of this transaction' or 'do the opposite of the last x transactions'.

As such, when you add a transaction to the backstack using addToBackStack(String name) the name has nothing to do with the fragments involved. It is not a fragment tag or identifier. It is an identifier for the state after the transaction in question.

When you want to reverse the transactions, you use the various popBackStack() methods. More information can be found here.
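
To make the 'the backstack holds transactions, not fragments' idea concrete, here is a toy model in plain Java. It is not Android's implementation - the class and method names (BackStackModel, commit, popBackStackTo) are made up for illustration, and I am assuming the non-inclusive behaviour where popping to a name leaves the named state in place.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model (hypothetical names): the back stack stores transactions, not fragments. */
class BackStackModel {
    /** One reversible instruction, e.g. "add B" or "remove A". */
    static final class Op {
        final boolean add;
        final String fragment;
        Op(boolean add, String fragment) { this.add = add; this.fragment = fragment; }
    }

    /** A named back stack entry: just the instructions that were applied. */
    private static final class Entry {
        final String name;
        final List<Op> ops;
        Entry(String name, List<Op> ops) { this.name = name; this.ops = ops; }
    }

    private final List<String> shown = new ArrayList<>();
    private final Deque<Entry> backStack = new ArrayDeque<>();

    /** Applies the ops now and records the transaction under the given name. */
    void commit(String name, Op... ops) {
        for (Op op : ops) apply(op, false);
        backStack.push(new Entry(name, Arrays.asList(ops)));
    }

    /** 'Do the opposite of the last transaction'. Returns false if the stack is empty. */
    boolean popBackStack() {
        Entry top = backStack.poll();
        if (top == null) return false;
        // Undo the ops in the opposite order to how they were applied.
        for (int i = top.ops.size() - 1; i >= 0; i--) apply(top.ops.get(i), true);
        return true;
    }

    /** Undoes every transaction above the named one; the named state itself stays. */
    void popBackStackTo(String name) {
        boolean found = backStack.stream().anyMatch(e -> e.name.equals(name));
        if (!found) return;
        while (!backStack.peek().name.equals(name)) popBackStack();
    }

    private void apply(Op op, boolean reverse) {
        if (op.add != reverse) shown.add(op.fragment);
        else shown.remove(op.fragment);
    }

    /** Comma-separated list of the fragments currently on screen. */
    String shown() { return String.join(",", shown); }
}
```

Note that the name passed to commit never touches a fragment - it only labels a point in the history that you can later pop back to.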

Memory considerations

As commented in this StackOverflow question, "when a fragment is put on the back stack, its onDestroyView() is called. If at that point you clean up any major allocations being held by the fragment, you shouldn't have memory issues". This quote is vital for appreciating the memory considerations of fragments (and remember.. as mentioned above, this is the same as when you detach a fragment).

When you detach a fragment, the following lifecycle methods are called: onPause(), onStop(), and onDestroyView(). (onDestroy() and onDetach() are only called when the fragment is removed outright.) If for example you had saved a reference to a visible GoogleMap in an instance variable it would be worth destroying this in onDestroyView() and recreating it if/when you attach the fragment once again and its onCreateView() method is called.

What about show/hide?

FragmentTransaction does have two additional methods - show and hide. I added these to my demo so you can see for yourself that these do exactly what they say on the tin.. they show and hide. No lifecycle methods are called, and every aspect of the fragment instance is held in memory (including the view hierarchy).

If you create a lot of fragments, hide them, and add new ones on top (why you would do this I do not know), you will run out of memory sooner or later.

If you have a fragment which has an expensive view hierarchy.. and you want it to be off screen for 3 seconds.. these methods may be of use.

Trying it out

The demo application (linked above) allows you to play with these various fragment transactions and view the lifecycle in your logcat output.

What you will notice is that if you 'Add' a fragment, then 'Remove' it, and then try to 'Attach' the same fragment nothing will happen.

This is because whilst you hold a reference to the fragment, when you remove a fragment the FragmentManager gets rid of its reference to it.

As such when you try and attach it, nothing happens.

Below I have outlined the lifecycle of the various approaches to using fragments. This should outline what happens, and why. With this knowledge you can use the most appropriate methods for your use case.

The lifecycle

1. Add

If you press the add button in the demo, these lifecycle methods are called.

We set an instance variable with the current time to demonstrate how state is saved.

onAttach  
onCreate  
onCreateView  
onActivityCreated  
onStart  
onResume  

Pop

If we had 'Add to backstack' checked, and then pressed 'Pop', these lifecycle methods are called.

onPause  
onStop  
onDestroyView  
onDestroy  
onDetach  

2. Remove

If you have added the fragment, and then remove it.. these lifecycle methods are called.

onPause  
onStop  
onDestroyView  
onDestroy  
onDetach  

Pop

Popping this will re-add the view. As you can see the view is recreated (onCreateView is called) - when removed the view is destroyed completely, even when the transaction is added to the backstack.

You will however see that the system time output is the same. This is because when removed, if added to the backstack the fragment is maintained in a state similar to when it is detached - its instance variables are retained.

If you were to simply add the fragment, remove it, and then press add again, the system time would be different as under this condition the state is not retained. The full lifecycle (as in 1.) would be executed.

onCreateView  
onActivityCreated  
onStart  
onResume  

3. Detach

If you have pressed 'Add', and then click 'Detach' the following lifecycle methods are executed.

onPause  
onStop  
onDestroyView  

Pop

If at this point you had had addToBackstack checked and you now pop you will see the following lifecycle methods called. You will note that the view is recreated and that the system time is the same.

In fact this is the exact same as a 'post-remove-pop'.

onCreateView  
onActivityCreated  
onStart  
onResume  

4. Attach

If you attach a previously detached fragment, the below lifecycle methods are called.

Again you will notice that the system time is maintained.

onCreateView  
onActivityCreated  
onStart  
onResume  

Pop

If you attach a fragment, add it to the backstack and then pop it, the following lifecycle methods are called.

The difference between this, and a 'post-add-pop' is that onDestroy and onDetach are not called.

The significance of this is that after popping you could press 'Attach' once again, and you would notice that the system time is the same.

On the other hand when you 'Add' a fragment, and pop it from the backstack it is completely destroyed - its view.. its instance variables.. everything.

onPause  
onStop  
onDestroyView  
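
The four cases above (ignoring the backstack) can be condensed into a small state model. This is only a sketch of the behaviour listed in this post, written in plain Java with made-up names - it is not how the FragmentManager is implemented, and combinations I have not covered above are simply treated as no-ops.

```java
/** Sketch (hypothetical names) of the callbacks listed above for add/remove/detach/attach. */
class FragmentLifecycleModel {
    enum State { NOT_ADDED, ADDED, DETACHED }

    private State state = State.NOT_ADDED;

    /** 1. Add: the full creation lifecycle runs. */
    String add() {
        if (state != State.NOT_ADDED) return "";
        state = State.ADDED;
        return "onAttach onCreate onCreateView onActivityCreated onStart onResume";
    }

    /** 2. Remove: view and instance are torn down; the manager forgets the fragment. */
    String remove() {
        if (state != State.ADDED) return ""; // other cases are not covered in the post
        state = State.NOT_ADDED;
        return "onPause onStop onDestroyView onDestroy onDetach";
    }

    /** 3. Detach: only the view is destroyed; the instance is kept. */
    String detach() {
        if (state != State.ADDED) return "";
        state = State.DETACHED;
        return "onPause onStop onDestroyView";
    }

    /** 4. Attach: the view is rebuilt. After a remove there is nothing to attach. */
    String attach() {
        if (state != State.DETACHED) return "";
        state = State.ADDED;
        return "onCreateView onActivityCreated onStart onResume";
    }
}
```

Calling add(), remove() and then attach() on this model returns an empty string for the final call, which is the 'nothing happens' behaviour described in 'Trying it out'.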

Are you having fun?

Because you are almost certainly having fun.. let's continue.

What the hell is setRetainInstance?!

The Fragment documentation states that it controls "whether a fragment instance is retained across Activity re-creation".

You can play with this in the demo by uncommenting the appropriate line in ExampleFragment.

What it essentially amounts to is that if you add a fragment, and then rotate your screen the following lifecycle methods are called:

onPause  
onStop  
onDestroyView  
onDestroy  
onDetach

onAttach  
onCreate  
onCreateView  
onActivityCreated  
onStart  
onResume  

If you hadn't noticed, this happens to be the same lifecycle as 'removing' a fragment, and then 'adding' it. Guess why? Because that is what is happening :)

What you will notice is that the system time output when it is re-added is different to when the fragment was initially added.

However this is expected.. after all we can see from the lifecycle that the fragment is completely destroyed before it is readded.

Now uncomment the setRetainInstance(true) line in the demo code and do the same thing again.

This time you will see the following lifecycle executed:

onPause  
onStop  
onDestroyView  
onDetach  
onAttach  
onCreateView  
onActivityCreated  
onStart  
onResume  

The important difference is that onDestroy and onCreate are not called. The fragment is not destroyed and recreated but rather its view is destroyed, and state maintained.

You still need to rebuild the view (for all the memory related reasons outlined previously) but the fragment state is retained. The output system time is the same as when it was initially created.

If you have written a beautiful sonnet and stored it in an instance variable (who hasn't..) then you will find that it is still there when you rotate your screen if setRetainInstance is passed a boolean true.
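
As a rough illustration of the difference, here is a plain Java model of what a rotation does to the fragment instance. The names (RetainInstanceModel, Frag, rotate) are invented for this sketch, and the callback strings are simply the two lists shown above.

```java
/** Sketch (hypothetical names) of a rotation with and without a retained instance. */
class RetainInstanceModel {
    /** Stand-in for a fragment that keeps state in an instance variable. */
    static final class Frag {
        final String sonnet;
        final boolean retainInstance;
        Frag(String sonnet, boolean retainInstance) {
            this.sonnet = sonnet;
            this.retainInstance = retainInstance;
        }
    }

    /** Models a rotation: returns the fragment the recreated activity ends up with. */
    static Frag rotate(Frag current) {
        if (current.retainInstance) {
            // The same object survives; only its view has to be rebuilt.
            return current;
        }
        // The old instance is destroyed and a fresh one is created, so anything
        // held only in an instance variable is gone.
        return new Frag(null, false);
    }

    /** The callbacks observed in the demo for each case, as listed above. */
    static String callbacksOnRotate(boolean retainInstance) {
        return retainInstance
            ? "onPause onStop onDestroyView onDetach onAttach onCreateView onActivityCreated onStart onResume"
            : "onPause onStop onDestroyView onDestroy onDetach onAttach onCreate onCreateView onActivityCreated onStart onResume";
    }
}
```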

How does the SDK use this?

The Android SDK uses this stuff everywhere. Fragments are after all a very clever design pattern (if you can call them that).

FragmentTabHost and FragmentStatePagerAdapter are the two which instantly spring to mind.

Given the above you can adapt these classes depending on your needs. For example, if you want a tab host which hosts two fragments that have complex view hierarchies yet are regularly changed, you might choose to modify the provided implementation. FragmentTabHost uses attach and detach within its doTabChanged implementation - you could change this to simply show/hide your tabs.

FragmentStatePagerAdapter uses add/remove exclusively, as well as an interesting caching mechanism. This is pretty logical really - if you had a lot of fragments in your pager, you certainly wouldn't want to keep all their views in memory at any given time.

Summary

Fragments seem more complex than they are due to confusing documentation and terminology. Furthermore because of the large number of combinations of approaches for different use cases it is pretty difficult to visualise and process what happens and why.

Hopefully the demo app will help you get your head around the concepts, and hopefully my explanations make sense.

If anything needs clarifying or you have any questions.. ask away :)

Fragment state restoration and the FragmentStatePagerAdapter

In the process of building an application for Android I encountered an interesting requirement with a non-trivial solution.

I wanted to implement horizontally scrollable 'pages' in such a way that when the user then added a new item, a new page would be added at the start.

The reason that this is not trivial is based on the way that the ViewPager class caches and restores its adapter.

Each page in my example will be a Fragment. As such it seemed appropriate to use the FragmentStatePagerAdapter class provided by Google.

The documentation states "When pages are not visible to the user, their entire fragment may be destroyed, only keeping the saved state of that fragment." - the way this saved state is restored is the cause of the issue. How do you restore the first fragment if the first fragment has changed?

In the ViewPager source (the ViewPager being the view that hosts a FragmentStatePagerAdapter) the onRestoreInstanceState method contains the following block:

    if (mAdapter != null) {
        mAdapter.restoreState(ss.adapterState, ss.loader);
        setCurrentItemInternal(ss.position, false, true);
    }

The documentation for the Activity class states that onRestoreInstanceState is called after onStart. It is this consideration which causes my issue.

What I want to do

In my application I display various fragments within a container, and maintain a backstack to allow back navigation between them.

When the fragment containing the FragmentStatePagerAdapter is initially created we set our adapter within onCreateView or onActivityCreated as appropriate.

We then click our 'Add' button in our toolbar and the fragment is replaced within the container with our 'Add new page' fragment. When this happens we add our FragmentTransaction to the backstack. This causes our 'pager' fragment NOT to be destroyed as outlined here.

We complete our 'Add' process and now we want to popBackstack to the initial pager fragment AND add a new page at the start.

We could do this by accessing the pager fragment from our FragmentManager and updating the adapter prior to popping back to it? When we then pop to it and notifyDataSetChanged, our new page should appear.. right? Nope.

When we pop to the pager fragment the view hierarchy is automatically restored by Android. The code snippet above shows that for a ViewPager the restore process checks if the pager has an adapter (which it does - we set it in onCreateView) and if it does uses the saved adapter state. The saved adapter state is the adapter state without our new page. This results in various weird things such as the correct number of pages but the incorrect fragment contents.

This happens because FragmentStatePagerAdapter caches its own contained fragments, and when we call restoreState (in the above snippet), these fragments are reused.

Interestingly, if you swipe to the right a few times, and swipe back then you will see the correct new Fragment. Why is this? Well.. as emphasized by the quote above, the FragmentStatePagerAdapter destroys its fragments as appropriate when they are offscreen. When offscreen the cached page one is destroyed. When we scroll back to it it is recreated. This time however it uses the correct new fragment at position one.

Resolution

There are many potential resolutions for this particular use case.

  • You could create a custom ViewPager and reimplement onRestoreInstanceState so as not to restore the pager in the way outlined above.

  • You could unset the adapter before moving to the 'add' fragment - this way, when you return the various restoration conditions triggered by the existence of an adapter will still be called but there won't be any data to reuse.

  • You could create a custom FragmentStatePagerAdapter and add a cache clearing mechanism for such a use case.

I went for the final option, so I'll explain a little more how I approached it.

The FragmentStatePagerAdapter source is open source. It is not superbly complex and as such in my opinion is perfectly suitable for reimplementing as long as you know what you are doing.

I implemented the following snippet within my custom adapter:

    // Set when the cache has been cleared, so that restoreState can skip restoring it.
    private boolean clearCache = false;

    public void clearCache() {
        mFragments.clear();
        clearCache = true;
    }

What this does is it clears the cached list of fragments in the adapter. It then sets a Boolean indicating this.

The reason for this Boolean is that in restoreState, the adapter will try and restore mFragments by getting the fragments from the saved bundle. In this case we do not want to restore them and as such we change the line if (state != null) { to if (state != null && !clearCache) { in restoreState and reset the clearCache variable to false at the end.

Now when the adapter goes to access a fragment at a particular position, due to its non-existence in the cache it will create it. Our new page will be displayed correctly.

This approach has the benefit that you can conditionally destroy the cache. In normal usage when a user scrolls between fragments the caching mechanisms will work as normal. When however you add a new page you can invalidate the cache.
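
To show the stale cache problem and the effect of the flag in isolation, here is a stripped down model in plain Java. It is not the real FragmentStatePagerAdapter - strings stand in for fragments, a list stands in for the saved bundle, and only the mFragments, clearCache and restoreState names are taken from the snippet above. Everything else is invented for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model (not the real adapter) of the fragment cache and the clearCache flag. */
class CachingPagerAdapterModel {
    private final List<String> pages;                           // backing data, one per page
    private final List<String> mFragments = new ArrayList<>();  // cached "fragments" by position
    private boolean clearCache = false;

    CachingPagerAdapterModel(List<String> pages) { this.pages = pages; }

    /** Returns the cached fragment for a position, creating it only if it is missing. */
    String instantiateItem(int position) {
        if (position < mFragments.size() && mFragments.get(position) != null) {
            return mFragments.get(position);
        }
        String created = "Fragment(" + pages.get(position) + ")";
        while (mFragments.size() <= position) mFragments.add(null);
        mFragments.set(position, created);
        return created;
    }

    /** Stand-in for the saved bundle: a copy of the cache. */
    List<String> saveState() { return new ArrayList<>(mFragments); }

    /** Restores the cache from saved state unless the cache was explicitly cleared. */
    void restoreState(List<String> state) {
        if (state != null && !clearCache) {
            mFragments.clear();
            mFragments.addAll(state);
        }
        clearCache = false; // reset so that later restores behave normally
    }

    void clearCache() {
        mFragments.clear();
        clearCache = true;
    }
}
```

If you save the state, insert a new page at position zero and restore, instantiateItem(0) hands back the old cached fragment. Call clearCache() before the restore and it creates the fragment for the new page instead.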

Other resolutions to similar problems are not appropriate for working with large adapters. This approach should however be quite performant. If you have one hundred fragments, it is safe to say that Android will not keep them all in memory. It may well retain state for individual fragments yet from my brief experimentation this does not seem to be the case.

One small hiccup

There does seem to be one small hiccup with this approach - namely that the FragmentStatePagerAdapter does not maintain the correct references to its adapter instance. That is to say if you set the adapter of the pager and then simply clear its cache prior to popping back to it, you will not see the result that you expect. This happens because the adapter has its own copy of the old adapter.

So as to not have to create a new adapter instance you simply need to set the adapter on the pager again. I use the following in onCreateView:

    if (adapter == null) {
        adapter = new PhotoAdapter(getChildFragmentManager());
    }
    adapter.clearCache();
    pager.setAdapter(adapter);

The long and the short

The long and the short of it seems to be that in providing a clever caching mechanism for optimal memory management, the Android team have made implementing a relatively simple requirement somewhat complex. Memory management is however superbly important, and I am just happy that I don't have to implement the FragmentStatePagerAdapter concept in its entirety :)

Hopefully this will help someone trying to achieve a similar thing.

Android - pull to refresh, and scroll to load more

Over the last couple of days I have been investigating the pull to refresh paradigm, and its possible implementation in an Android app.

A quick Google search produces this library from Chris Banes, and this fork from Naver Communications.

They are however for my use cases too complex and not customizable enough for what I want. (I appreciate that they are both extremely customizable.. just not in the ways that I need ;))

The latest versions of the Google support library include the SwipeRefreshLayout (documentation link) which although 'official' does not do the entirety of what I require. Whilst I could extend and adapt this in-built widget, from playing with the various implementations I had found across the web, I felt that implementing it myself would be the best course of action.

I like knowing exactly what my code does and how it works. Writing it myself allows this.. whilst also removing the bloat of larger overly complex libraries.

What I wanted

I wanted to extend my ListView so that I could 'pull' down at the top to refresh, and scroll down at the bottom to load more data.

I wanted my actions to be explicit, so unless you pull down enough or scroll down enough the actions won't be executed and the list will animate back.

Conceptually..

On a conceptual level what I want to do is:

  • Have a list header which will contain the refresh control

  • Have a list footer which will contain the footer control

  • When the user scrolls the list view, work out if they are scrolling either of these views into view

  • If they are, on stopping scrolling execute one of the following actions:

    • animate the control back off the screen if the user has not scrolled far enough
    • animate to the top/bottom of the control respectively, update its display, and begin the background task associated with refreshing/loading more if they have scrolled far enough
  • If the former, no further action is taken

  • If the latter, once the background task has completed the control should animate back off the screen
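
The decision made when scrolling stops can be sketched as a tiny state machine. This is plain Java with made-up names rather than code from my PRListView, and it assumes that 'far enough' means the control has been revealed by at least its full height - your threshold may well differ.

```java
/** Sketch (hypothetical names) of the release decision for a refresh/load-more control. */
class PullToRefreshModel {
    enum State { HIDDEN, PULLING, LOADING }

    private State state = State.HIDDEN;
    private final int controlHeightPx;

    PullToRefreshModel(int controlHeightPx) { this.controlHeightPx = controlHeightPx; }

    /** Called while the user scrolls, with how many pixels of the control are visible. */
    void onScrolled(int revealedPx) {
        if (state != State.LOADING) {
            state = revealedPx > 0 ? State.PULLING : State.HIDDEN;
        }
    }

    /**
     * Called when scrolling stops. Returns true if the background task should start;
     * false means the control should simply animate back off the screen.
     */
    boolean onScrollStopped(int revealedPx) {
        if (state != State.PULLING) return false;
        boolean farEnough = revealedPx >= controlHeightPx; // assumed threshold
        state = farEnough ? State.LOADING : State.HIDDEN;
        return farEnough;
    }

    /** Called once the background task has completed: animate the control away. */
    void onTaskFinished() {
        if (state == State.LOADING) state = State.HIDDEN;
    }

    State state() { return state; }
}
```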

Open source

Intriguingly, and somewhat surprisingly there seems to be an extreme lack of commenting across open source code. If you were looking at some of these projects you may be a bit flummoxed as to what a code block is doing and why..

This in my opinion is a real issue within the Android ecosystem whereby API differences mean that many code bases are built in certain ways for the APIs they were targeting at the time.

For example I had a look at this library by erikwt which does some intriguing things with animations for reasons that are not immediately clear..

That said, I am not complaining. Open source is great and I'd rather people share their code than keep it to themselves :)

Execution

The main conceptual premises behind the working of this code are as follows:

Is it visible

We determine if the header or footer is visible by utilizing the getFirstVisiblePosition() and getLastVisiblePosition() methods of ListView and maintaining control state using enumerations.

Header/Footer padding

The ability to scroll past the top/bottom of the list view is achieved by adding top/bottom padding to the respective control views in amounts that are relative to the amount the user has scrolled.

Animating padding

The animation of padding properties is achieved by proxying the padding property of these control views into a form that can be used by an ObjectAnimator (docs). The docs (here) state that 'The object property that you are animating must have a setter function'. Someone has kindly written a gist which illustrates how you can do this.

Here you go..

I will clean this up and release it as a project at some point. For now I have provided the full source of my PRListView as a gist.

I apologize for the colloquial language in my comments.. I was basically just outputting my internal dialogue. Nonetheless I hope the commented code helps you get to grips with it.

Thanks a lot to the library authors mentioned above whose code influenced the development of this extension.

Value. The data, or the packaging?

For one of my web based projects I needed to acquire some data. Data in the modern day and age is often the most valuable asset of a company... it got me thinking.

The Royal Mail here in the UK offer a Postcode Address File (PAF) to any entity who wishes to have it. The cost of this PAF file is many many thousands of pounds depending on exactly what you want and how you intend to use it.

Recently I was building a location based product. Using the PAF database was in no way financially viable. Instead I did a lot of research into alternative location groupings for the United Kingdom and gave consideration as to how I would acquire the data I needed to work with said groupings. I opted to utilize city, town, and village names grouped by their local government area, and I managed to acquire this data for a relatively insignificant sum.

My next requirement was to acquire data on the establishments that my website lists. My first approach was simply to manually input data - this gave me my first appreciation for the value of data. Were I to have continued to manually input this data my cost would have amounted to the value of my time. Utilizing the British minimum wage and extremely conservative time estimates a complete dataset would have cost me five figures.

Being an engineer I quickly began investigating more intelligent ways of acquiring the required data. I ended up utilizing Facebook's API, and data scraping to begin the process of acquiring the data that I would need.

Facebook

Facebook seem to be pretty open with their data, and their API. Whilst to some extent you can put a cost on data acquisition, Facebook are in a position whereby sharing data is not going to have too much of a negative effect on them. It is actually more likely that Facebook will gain from my usage of a tiny subset of their data given the number of Facebook integrations across the product.

I pulled data from Facebook, but in the interest of producing a quality product I built functionality to allow me to manually process this data. Facebook allowed me to spend less time collecting data and thus reduced my cost.

Scraping

Having over a long period gradually integrated the data I had obtained from Facebook, I had the shocking realisation that the data was surprisingly incomplete. In this particular niche, many companies run a large number of establishments. As such I decided to scrape further data from the websites of these companies to try and fill the blanks.

What I noticed from this process is that in this particular niche, companies do not really value their data.. and that is somewhat understandable. I want their data so that I can essentially market them. On consideration I am surprised they did not make their data more easily accessible.

I found:

  • a few companies whose data was consistently formatted and easy to extract

  • a few companies who had placed the required data across multiple pages. Annoying but manageable.

  • A few companies using clean JSON backends

  • One company that really needs to hire a new web developer

My approach

Given that this was a one time thing, I was not interested in writing perfect, clean, well tested code. I built a basic scraper using PHP's cURL functions, opened up the various pages and pulled out the data I needed from the source code.

Extracting the data essentially amounted to:

  • analysing the source code and working with PHP's DOMDocument

  • calling json_decode.

Now what

At this present point in time I have collected a sufficient amount of data to make my product viable. It has enough data to make it useful, and the public seem to agree. My hope is that users of the product will now contribute to the continued growth and general improvement of the dataset.

Given that I have spent a large amount of time and effort collecting, and moderating the dataset I have to some extent come full circle. This dataset now adds significant value to my product. I have done a lot of the heavy lifting and so now surely I want to stop others from scraping my data? Nope..

You cannot stop scraping

The long and the short of it is that if you want to provide your data to a good person (a site visitor or a search engine), then you have to provide it to the bad people too. The web is inherently open and free - you can either share your data or take your website down.

I was able to scrape all the data that I wanted using cURL. It was somewhat tedious, but it was not hard. This is probably the simplest tool in a scraper's toolkit.

You can try and obfuscate your source code, hide your data behind complex authorization procedures etc but this will only hurt you.

Google is a scraper.. as is Bing. If you want your website to be search engine friendly then it is also going to be bad person friendly.

Things like CAPTCHAs and required logins ruin the user experience.

Headers can be faked, IPs bought and so on.

Even considering the above, there are headless browsers like PhantomJS which are browsers. Phantom Mochachino, a tool which I built for testing demonstrates how powerful headless browsers are #shamelessplug. A scraper can use a headless browser in exactly the same manner that a normal user uses their browser.

You can make things harder, but anyone who is committed enough can, and will get your data.

It is for that reason I opted not to attempt to prevent scraping. Rather I made my API openly accessible to all.

Think about it

Wikipedia is a really large database, yet you don't see many Wikipedia clones ranking highly in Google. Even if they did you would more than likely click on the Wikipedia link as opposed to the http://rubbish-wikipedia-clone.com link.

Likewise with Stack Overflow - in this case there are loads of clones, but again I don't think I have ever used one and I highly doubt that they have any effect on Stack Overflow's visitor numbers.

Packaging

All things considered whilst data is extremely valuable the packaging of it is in many respects more important.

In my case my product is premised around helping companies within the niche improve visitor numbers whilst providing the general public with a useful and informative resource. I want to incentivize mutual cooperation and incentivize users to contribute to the site. I believe that more people will contribute if they know what we are doing with their data - we are making it freely accessible to anyone who wants to use it and help the industry grow.

So to answer the question.. for the particular dataset that I am working with, both have value. At least I hope they do.