Double Negative

Software, code and things.

Android - pull to refresh, and scroll to load more

Over the last couple of days I have been investigating the pull to refresh paradigm, and its possible implementation in an Android app.

A quick Google search produces this library from Chris Banes, and this fork from Naver Communications.

For my use cases, however, they are both too complex and not customizable enough. (I appreciate that they are both extremely customizable.. just not in the ways that I need ;))

The latest versions of the Google support library include the SwipeRefreshLayout (documentation link) which, although 'official', does not do the entirety of what I require. Whilst I could extend and adapt this built-in widget, from playing with the various implementations I had found across the web, I felt that implementing it myself would be the best course of action.

I like knowing exactly what my code does and how it works. Writing it myself allows this.. whilst also removing the bloat of larger overly complex libraries.

What I wanted

I wanted to extend my ListView so that I could 'pull' down at the top to refresh, and scroll down at the bottom to load more data.

I wanted my actions to be explicit, so unless you pull down enough or scroll down enough the actions won't be executed and the list will animate back.

Conceptually..

On a conceptual level what I want to do is:

  • Have a list header which will contain the refresh control

  • Have a list footer which will contain the load more control

  • When the user scrolls the list view, work out if they are scrolling either of these views into view

  • If they are, on stopping scrolling execute one of the following actions:

    • animate the control back off the screen if the user has not scrolled far enough
    • animate to the top/bottom of the control respectively, update its display, and begin the background task associated with refreshing/loading more if they have scrolled far enough
  • If the former, no further action is taken

  • If the latter, once the background task has completed the control should animate back off the screen
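
The flow above can be sketched as a small state machine. This is a language-neutral sketch written in Python rather than Android code, and the names ControlState, on_scroll_stopped, on_task_finished and the pixel threshold are all hypothetical; none of them are taken from the PRListView source.

```python
from enum import Enum


class ControlState(Enum):
    HIDDEN = "hidden"      # control is fully off screen
    PULLING = "pulling"    # user is dragging the control into view
    LOADING = "loading"    # background task is running


def on_scroll_stopped(state, pulled_px, trigger_px):
    """Decide what happens when the user stops scrolling.

    Returns (new_state, action), where action is one of
    "none", "animate_back", or "start_task".
    """
    if state is not ControlState.PULLING:
        return state, "none"
    if pulled_px >= trigger_px:
        # Far enough: snap to the edge of the control and start the task.
        return ControlState.LOADING, "start_task"
    # Not far enough: animate the control back off the screen.
    return ControlState.HIDDEN, "animate_back"


def on_task_finished(state):
    """Once the background task completes, hide the control again."""
    if state is ControlState.LOADING:
        return ControlState.HIDDEN, "animate_back"
    return state, "none"
```

Keeping the decision in one pure function means the 'far enough' rule can be checked without a device or an emulator.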

Open source

Intriguingly, and somewhat surprisingly, there seems to be an extreme lack of commenting across open source code. If you were looking at some of these projects you may be a bit flummoxed as to what a code block is doing and why..

This in my opinion is a real issue within the Android ecosystem, where API differences mean that many code bases are built in certain ways for the APIs they were targeting at the time.

For example I had a look at this library by erikwt which does some intriguing things with animations for reasons that are not immediately clear..

That said, I am not complaining. Open source is great and I'd rather people share their code than keep it to themselves :)

Execution

The main conceptual premises behind the working of this code are as follows:

Is it visible

We determine if the header or footer is visible by utilizing the getFirstVisiblePosition() and getLastVisiblePosition() methods of ListView and maintaining control state using enumerations.
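
As a rough illustration of that check, here is a Python sketch (not the Android code itself). It assumes a single header at position 0 and a single footer at the last position, and the function name visible_controls is hypothetical. The two position arguments stand in for the values returned by getFirstVisiblePosition() and getLastVisiblePosition().

```python
def visible_controls(first_visible, last_visible, total_count):
    """Return (header_visible, footer_visible).

    Assumes one header at position 0 and one footer at position
    total_count - 1, where total_count includes both controls.
    """
    header_visible = first_visible == 0
    footer_visible = last_visible == total_count - 1
    return header_visible, footer_visible
```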

Header/Footer padding

The ability to scroll past the top/bottom of the list view is achieved by adding top/bottom padding to the respective control views in amounts that are relative to the amount the user has scrolled.
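
One way to picture 'relative to the amount the user has scrolled' is a simple damped mapping from drag distance to padding. This Python sketch is only an assumption about how such a mapping might look; the resistance factor, the cap, and the name control_padding are illustrative and not taken from my implementation.

```python
def control_padding(drag_px, resistance=0.5, max_padding_px=300):
    """Map a drag distance to padding on the header/footer control.

    resistance and max_padding_px are illustrative values only.
    """
    if drag_px <= 0:
        return 0
    # Damp the drag so the control feels like it resists the pull,
    # and cap it so the list cannot be dragged away indefinitely.
    return min(max_padding_px, int(drag_px * resistance))
```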

Animating padding

The animation of padding properties is achieved by proxying the padding property of these control views into a form that can be used by an ObjectAnimator (docs). The docs (here) state that 'The object property that you are animating must have a setter function'. Someone has kindly written a gist which illustrates how you can do this.
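
The proxy idea can be sketched outside of Android. In this Python sketch FakeView stands in for a view whose padding can only be set all at once, PaddingProxy exposes the single-value setter an animator needs, and animate is a crude linear stand-in for what ObjectAnimator does. All three names are hypothetical.

```python
class FakeView:
    """Stand-in for a view whose padding can only be set all at once."""

    def __init__(self):
        self.padding = (0, 0, 0, 0)  # left, top, right, bottom

    def set_padding(self, left, top, right, bottom):
        self.padding = (left, top, right, bottom)


class PaddingProxy:
    """Exposes a single-value setter so an animator can drive top padding."""

    def __init__(self, view):
        self.view = view

    def set_padding_top(self, value):
        # Keep the other three paddings and replace only the top one.
        left, _, right, bottom = self.view.padding
        self.view.set_padding(left, value, right, bottom)


def animate(setter, start, end, steps):
    """Call setter with linearly interpolated values from start to end."""
    for i in range(steps + 1):
        setter(round(start + (end - start) * i / steps))
```

Animating the control back off the screen then amounts to animate(PaddingProxy(view).set_padding_top, current_padding, 0, steps).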

Here you go..

I will clean this up and release it as a project at some point. For now I have provided the full source of my PRListView as a gist.

I apologize for the colloquial language in my comments.. I was basically just outputting my internal dialogue. Nonetheless I hope the commented code helps you get to grips with it.

Thanks a lot to the library authors mentioned above whose code influenced the development of this extension.

Value. The data, or the packaging?

For one of my web based projects I needed to acquire some data. Data in the modern day and age is often the most valuable asset of a company... it got me thinking.

The Royal Mail here in the UK offer a Postcode Address File (PAF) to any entity who wishes to have it. The cost of this PAF file is many many thousands of pounds depending on exactly what you want and how you intend to use it.

Recently I was building a location based product. Using the PAF database was in no way financially viable. Instead I did a lot of research into alternative location groupings for the United Kingdom and gave consideration as to how I would acquire the data I needed to work with said groupings. I opted to utilize city, town, and village names grouped by their local government area, and I managed to acquire this data for a relatively insignificant sum.

My next requirement was to acquire data on the establishments that my website lists. My first approach was simply to manually input data - this gave me my first appreciation for the value of data. Were I to have continued to manually input this data my cost would have amounted to the value of my time. Utilizing the British minimum wage and extremely conservative time estimates, a complete dataset would have cost me five figures.

Being an engineer I quickly began investigating more intelligent ways of acquiring the required data. I ended up utilizing Facebook's API, and data scraping, to begin the process of acquiring the data that I would need.

Facebook

Facebook seem to be pretty open with their data, and their API. Whilst to some extent you can put a cost on data acquisition, Facebook are in a position whereby sharing data is not going to have too much of a negative effect on them. It is actually more likely that Facebook will gain from my usage of a tiny subset of their data given the number of Facebook integrations across the product.

I pulled data from Facebook, but in the interest of producing a quality product I built functionality to allow me to manually process this data. Facebook allowed me to spend less time collecting data and thus reduced my cost.

Scraping

Having over a long period gradually integrated the data I had obtained from Facebook, I had the shocking realisation that the data was surprisingly incomplete. In this particular niche, many companies run a large number of establishments. As such I decided to scrape further data from the websites of these companies to try and fill in the blanks.

What I noticed from this process is that in this particular niche, companies do not really value their data.. and that is somewhat understandable. I want their data so that I can essentially market them. On consideration I am surprised they did not make their data more easily accessible.

I found:

  • a few companies whose data was consistently formatted and easy to extract

  • a few companies who had placed the required data across multiple pages. Annoying but manageable.

  • a few companies using clean JSON backends

  • One company that really needs to hire a new web developer

My approach

Given that this was a one time thing, I was not interested in writing perfect, clean, well tested code. I built a basic scraper using PHP's cURL functions, opened up the various pages and pulled out the data I needed from the source code.

Extracting the data essentially amounted to:

  • analysing the source code and working with PHP's DOMDocument

  • calling json_decode.
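
The two extraction routes above can be sketched as follows. I did this in PHP, so this Python version is only an illustration of the idea; the venue-name class, the name key, and the function names are all hypothetical and do not come from any of the sites I scraped.

```python
import json
from html.parser import HTMLParser


class NameExtractor(HTMLParser):
    """Collects the text of every element with class="venue-name"."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._capture = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if ("class", "venue-name") in attrs:
            self._capture = True

    def handle_data(self, data):
        if self._capture and data.strip():
            self.names.append(data.strip())

    def handle_endtag(self, tag):
        # Good enough for flat markup; nested tags would need a depth counter.
        self._capture = False


def names_from_html(html):
    """Route one: walk the page source and pick out the elements we need."""
    parser = NameExtractor()
    parser.feed(html)
    return parser.names


def names_from_json(body):
    """Route two: the site has a JSON backend, so just decode it."""
    return [item["name"] for item in json.loads(body)]
```

The JSON route is a one-liner, which is why the companies with clean JSON backends were by far the easiest to work with.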

Now what

At this present point in time I have collected a sufficient amount of data to make my product viable. It has enough data to make it useful, and the public seem to agree. My hope is that users of the product will now contribute to the continued growth and general improvement of the dataset.

Given that I have spent a large amount of time and effort collecting and moderating the dataset, I have to some extent come full circle. This dataset now adds significant value to my product. I have done a lot of the heavy lifting and so now surely I want to stop others from scraping my data? Nope..

You cannot stop scraping

The long and the short of it is that if you want to provide your data to a good person (a site visitor or a search engine), then you have to provide it to the bad people too. The web is inherently open and free - you can either share your data or take your website down.

I was able to scrape all the data that I wanted using cURL. It was somewhat tedious, but it was not hard. This is probably the simplest tool in a scraper's toolkit.

You can try to obfuscate your source code, hide your data behind complex authorization procedures etc., but this will only hurt you.

Google is a scraper.. as is Bing. If you want your website to be search engine friendly then it is also going to be bad person friendly.

Things like CAPTCHAs and required logins ruin the user experience.

Headers can be faked, IPs bought and so on.

Even considering the above, there are headless browsers like PhantomJS which are, to all intents and purposes, real browsers. Phantom Mochachino, a tool which I built for testing, demonstrates how powerful headless browsers are #shamelessplug. A scraper can use a headless browser in exactly the same manner that a normal user uses their browser.

You can make things harder, but anyone who is committed enough can, and will get your data.

It is for that reason I opted not to attempt to prevent scraping. Rather I made my API openly accessible to all.

Think about it

Wikipedia is a really large database, yet you don't see many Wikipedia clones ranking highly in Google. Even if they did you would more than likely click on the Wikipedia link as opposed to the http://rubbish-wikipedia-clone.com link.

Likewise with Stack Overflow - in this case there are loads of clones, but again I don't think I have ever used one and I highly doubt that they have any effect on Stack Overflow's visitor numbers.

Packaging

All things considered whilst data is extremely valuable the packaging of it is in many respects more important.

In my case my product is premised around helping companies within the niche improve visitor numbers whilst providing the general public with a useful and informative resource. I want to incentivize mutual cooperation and incentivize users to contribute to the site. I believe that more people will contribute if they know what we are doing with their data - we are making it freely accessible to anyone who wants to use it and help the industry grow.

So to answer the question.. for the particular dataset that I am working with, both have value. At least I hope they do.

Social implementations of three-legged oAuth

I was implementing some functionality utilizing the Twitter API and found the documentation to be extremely lacking/unclear.

Given that the oAuth protocol seems more complicated than it actually is, I thought I'd document some extra explanation to accompany Twitter's sign in implementation docs.

3 Legged oAuth

Twitter's explanation of 3 Legged oAuth is rubbish. Full stop.

Let me briefly try and explain what it is because there seem to be very few resources that explain it well.

It is called three legged oAuth because there are three participants: the User, Website, and Service.

That said the process can also be considered as being made up of three stages as follows:

Leg 1

  • User wants to provide Website with data from Service
  • Website tells Service that it wants said data.

Leg 2

  • Website sends User to Service
  • User authorizes data request.
  • Service sends User back to Website

Leg 3

  • Website requests access token from Service

Why?

Website can access User's data from Service without knowing User's username and password for Service

Overview

Twitter's implementation of those three legs is as follows:

request_token

The request_token step of the oAuth process is essentially telling Twitter who you are and what you want.

You as the consumer pass in your consumer_key, and sign the request with your consumer_key_secret - this indicates who you are.

Twitter then knows what permission you want: read only; read and write; or read, write, and direct message access - you have set these in your app's settings.

You can also pass in an oauth_callback header - this tells Twitter where to send your user once authorization is complete. If you don't pass this header Twitter will redirect users to the callback URL set in your settings. I find it worthwhile explicitly setting this on each request such that you can swap out different callback URLs for development.

I encountered an issue with the oauth_callback header which was resulting in me receiving the error message 'Failed to validate oauth signature and token'. If I did not pass in an oauth_callback header I would receive a token without issue.

My problem was with my oauth signature. It required that my oauth_callback was encoded twice.

As is typical, it is significantly easier to find out about an issue when you know what that issue is. A post-fix search presented me with this Stack Overflow answer which extremely succinctly explains why it needs to be encoded twice.

Another important consideration when creating your oAuth signature is to order your parameters alphabetically when you are combining them to build a signature.
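
Both points, the double encoding and the alphabetical ordering, fall out naturally if you build the signature base string the way the OAuth 1.0a spec describes. Below is a minimal Python sketch of that, not the code from my PHP wrapper. The function names pct, signature_base_string and sign are hypothetical, and any keys, secrets, nonces or callback URLs you feed in are placeholders.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode, leaving only unreserved characters untouched."""
    return quote(str(value), safe="-._~")


def signature_base_string(method, url, params):
    # Encode every key and value, then sort alphabetically by encoded key.
    encoded = sorted((pct(k), pct(v)) for k, v in params.items())
    param_string = "&".join(f"{k}={v}" for k, v in encoded)
    # Encoding the whole parameter string encodes each value a second
    # time, which is why oauth_callback ends up encoded twice.
    return "&".join([method.upper(), pct(url), pct(param_string)])


def sign(base_string, consumer_secret, token_secret=""):
    # The token secret is empty at the request_token step.
    key = f"{pct(consumer_secret)}&{pct(token_secret)}"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()
```

With a callback of http://localhost/cb the parameter string contains http%3A%2F%2Flocalhost%2Fcb, and the final base string contains http%253A%252F%252Flocalhost%252Fcb. If the second pass is skipped, the signature will not match the one Twitter computes.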

oauth/authenticate - oauth/authorize

Your user should be sent to one of these endpoints passing the request_token returned by the previous step. Here they will authorize your application to access/not access their data.

The former endpoint will automatically redirect an already authorized user to the oauth_callback url specified in the previous step. oauth/authorize requires authorization each time.

On completion the user is redirected to your oauth_callback. An oauth_token and oauth_verifier parameter are passed back if the user has authorized your app.
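
Reading those two parameters off the callback request is straightforward. A small Python sketch, assuming the callback arrives as a full URL; the function name read_callback is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def read_callback(url):
    """Return (oauth_token, oauth_verifier), or None if either is missing."""
    query = parse_qs(urlparse(url).query)
    if "oauth_token" in query and "oauth_verifier" in query:
        return query["oauth_token"][0], query["oauth_verifier"][0]
    # No verifier means the user did not authorize the app.
    return None
```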

oauth/access_token

Finally, you POST the oauth_verifier to this endpoint and an oauth_token/oauth_token_secret will be returned. If you write these down ;) or store them in a database, you can access the user's protected data without the above process until the tokens expire.

Reliance on Libraries

On a previous project I utilized abraham's PHP twitter library. I remember having no issues with it, but it was very much the case that I was unaware of its internals.

This time I figured I would read up on oAuth and make sure I knew exactly what my code was doing. Whilst browsing I came across j7mbo's twitter wrapper which is a very barebones PHP wrapper for the Twitter API. The beauty of it is that:

  • It works
  • It is simple, easy to understand, and only 300 lines long

There is no point unnecessarily reinventing the wheel so I took this wrapper and extended it to my needs.

Because of the simplicity of the code it was relatively painless for me to debug the double encoding issue with oauth_callback that I outlined above.

Comparison with the Facebook API

If you are implementing functionality that utilizes the Twitter API, the chances are that you are implementing something more generically social. I was - I also needed to work with the Facebook API.

There are a few similarities between the APIs:

  • They both utilize oAuth
  • They both have rubbish documentation

That said I found myself enjoying working with the Facebook API to a greater extent.. (is that weird?).

Firstly Facebook provides an official PHP SDK - in my mind I can be relatively confident that it is well tested, and works. It is a significantly more complex wrapper than the Twitter wrapper mentioned above but one can assume that a company like Facebook would religiously test their SDKs.

In the same way that when testing your own products you assume that external APIs work (or you don't use them), I am happy to assume that the SDK works.

What I like about this wrapper is the helper functions. They provide a FacebookRedirectLoginHelper class which means that to get the login URL to which I redirect the user (to get their authorization), I simply need to call the getLoginUrl method.

On top of that, responses are wrapped up in an object orientated interface so I can get my access token by calling $session->getToken(); and can get properties of the graph objects contained within my response using $graphObject->getProperty('email');

That wasn't so oAuthful..?

So there you have it.. a basic overview of oAuth and its utilization by two social powerhouses.

It has now gotten to a point where I have implemented these APIs so many times that it is somewhat second nature.

The beauty of the oAuth protocol is that once you have a grasp on what it is, and how it works it is clear to see why it is so powerful.

If you want to learn more check out the oAuth website.

Now I'm going to ponder implementing oAuth security into one of my publicly available APIs that really has zero use for it ;)

The UIViewController: Actual Lifecycle and Acceptable Hierarchy

I am working on an iOS app for a product that I have been building. Throughout the process I have come up against some hurdles and have sought to resolve them using the (fantastic) knowledge base that is Stack Overflow.

More so than when writing code for any other platform, I have found that Stack Overflow answers pertaining to Objective-C/Swift are full of inaccuracies, are misleading, or are downright wrong. As such I have spent a lot more time investigating issues myself and working out exactly why things happen and how things work.

Apple has extremely good documentation of its APIs, and application guidelines. What confuses me somewhat is the fact that they have not taken the time to write in-depth explanations of areas that might not be so obvious and areas that are often discussed and debated.

Given that a lot of the internals of Apple's APIs are private and one cannot simply look for an answer I think this is something they should invest some time in.

Recently I have encountered a number of considerations relating to the hierarchy and lifecycle of UIViewControllers.

Issue - UITabBarController in the hierarchy

Why exactly does your UITabBarController have to be the root controller? If you read the UITabBarController API documentation it clearly states: 'When deploying a tab bar interface, you must install this view as the root of your window.' Why is this?

Using Xcode 6 and iOS 8 I embedded a UITabBarController as a child at numerous levels of the hierarchy without issue. I am aware that in previous versions you could not.. but as things stand, you can. It would thus seem that at the moment the only reason not to do this is because Apple says not to.

Hands on

In the app that I am building I wanted to have tab bar navigation at the base of the application. In each tab various controls would allow you to open other views which I also wanted to contain independent tabbed navigation. This is not allowed (as outlined above).

After digging a bit I found out that actually it is.. The documentation states that 'It is possible (although uncommon) to present a tab bar controller modally in your app.'

As the tab bar controller always acts as the wrapper for the navigation controllers, each tab has to have its view controller embedded in a UINavigationController. Given that I want all the tabs to have the same navigation controls.. this is just annoying. Especially given that the docs state that 'you should embed only instances of the UINavigationController class, and not system view controllers that are subclasses of the UINavigationController class'.

It is extremely unclear as to whether you are 'allowed' to use your own custom UINavigationController subclasses. My interpretation is that it is OK. If you are only doing small manipulations and are calling the respective super methods I cannot see any reason why this would be an issue.

Issue - Why is viewWillAppear not consistently called?

What exactly is the UIViewController Lifecycle, and why does it vary under certain nesting circumstances?

For example viewWillAppear is not consistently called in a UIViewController nested in a UINavigationController displayed in a modal..

There is an example of a similar issue here. I don't personally recommend you use this answer. What I do recommend is that if you have complex or 'uncommon' view hierarchies you verify that the lifecycle methods you expect to be called are in fact called.

Issue - manipulating views based on resolved constraints.

Another intriguing issue is manipulating views when their subviews have been laid out. Apple does have the viewDidLayoutSubviews method, but again it is unclear exactly when this method is called. The documentation states: 'this method being called does not indicate that the individual layouts of the view's subviews have been adjusted. Each subview is responsible for adjusting its own layout.' This can lead to some interesting considerations which I have outlined below.

Hands on

In my modally presented UITabBarController I have a UIViewController (nested in a UINavigationController) in which I want to lay out some buttons based on the space available to me when my constraints have been resolved. To make things a little more complex, this is within a UIScrollView.

When my viewDidAppear method is called, my constraints have been resolved. Unfortunately however positioning and adding subviews here will at a minimum cause some flickering as they are displayed. This is not acceptable.

viewDidLayoutSubviews is called at undocumented times. I found from testing that viewDidLayoutSubviews was in fact called twice, and that after the first call the subviews of my UIScrollView were in fact not laid out. Only after the second execution were all my constraints resolved.

I have no interest in writing complex, error-prone conditionals such as 'calculate and add subviews the second time viewDidLayoutSubviews is called'. As such I decided the most definitive way of knowing when my scroll view's subviews had been laid out was by creating a custom subclass of UIScrollView and overriding its layoutSubviews method.

The actual view controller lifecycle for my setup is listed below. The frame size is also noted.

  • viewWillAppear (0.0,0.0,320.0,568.0)
  • The layoutSubviews method of my base view (0.0,0.0,320.0,568.0)
  • viewDidLayoutSubviews (0.0,0.0,320.0,568.0)
  • The layoutSubviews method of my scroll view (20.0,426.0,280.0,200.0) correct resolved frame
  • The layoutSubviews method of my base view (20.0,426.0,280.0,200.0) again
  • viewDidLayoutSubviews (20.0,426.0,280.0,200.0)
  • viewDidAppear (20.0,426.0,280.0,200.0)

The important thing to note here is that you cannot just assume that because viewDidLayoutSubviews has been called all your constraints have been resolved. The name is totally misleading, but it's a private API and there is nothing we can do about it sadly.

Because the layoutSubviews method can also be called numerous times it is important to make sure you don't run complex processing operations more often than necessary. In my case within layoutSubviews I have a simple check which verifies if my frame has changed since it was last processed. If it hasn't there is no need to re-process anything.
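
The frame check is simple enough to sketch outside of UIKit. This is a Python illustration of the idea rather than my actual UIScrollView subclass; the class name ScrollViewLayout, the rebuild callback, and representing a frame as a tuple are all assumptions made for the sketch.

```python
class ScrollViewLayout:
    """Re-runs expensive layout work only when the frame has changed."""

    def __init__(self, rebuild):
        self._rebuild = rebuild    # callback that positions the buttons
        self._last_frame = None    # frame seen on the last processed pass

    def layout_subviews(self, frame):
        """Return True if the layout was re-processed, False if skipped."""
        if frame == self._last_frame:
            return False           # nothing changed, skip the work
        self._last_frame = frame
        self._rebuild(frame)
        return True
```

Fed the frames from the lifecycle listed above, the rebuild runs once for the initial frame and once for the resolved frame, and any later passes that report the same frame are skipped.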

All things considered

After going to the effort to work out the above it hit me that my codebase was now significantly cleaner. I had separated my concerns to a greater extent and it felt more MVC-esque.

My manipulation of my views is now in a subclass of UIScrollView rather than in my UIViewController - my controller is now more targeted towards control and my view focussed on.. well.. the view.

I read somewhere in the Apple documentation that view manipulation from a UIViewController is perfectly acceptable. It is in the name really :) That said I find it incredibly intriguing that the way Apple has built its product and presented it to developers inherently results in what I believe to be better designed code bases.

I have learned a lot because Apple's codebase is private. Some more documentation would still be appreciated :)