In the following blog I wanted to document encountering a problem on the Alteryx Community Discussion Boards, discuss how I went about solving it and reflect on the experience.
Last night I was browsing the Alteryx Community discussion boards hoping to log a first career problem solved to my account. I stumbled upon a curious challenge that no-one had offered a solution for. The user had a dataset of 500 rows, some had a groupID, others had Null. The user wanted a way to assign the Nulls to groups but wanted each group to be 20 rows. An additional requirement is that the cases be assigned in ID order, so the lowest rowID assigned first to the lowest group with available space.
The user's original thoughts on the problem was some kind of conditional multi-row formula to keep adding the same ID until 20 exist. I think this logic might have worked had no groups been assigned to that point. Ordering by row ID and having an ID copy for 20 rows before moving on. With groups already assigned I could not see a way to ensure we were assigning correctly so went in a different direction.
![](https://www.thedataschool.co.uk/content/images/2023/04/image-175.png)
My first step was to split the data into assigned and unassigned. For the assigned flow I summarized the data to work out how many groups we had and how many IDs were in them. I then used a row-level formula of 20 - count of IDs to compute spaces remaining. (NOTE this solution does assume we have at least 1 person assigned to each group in the dataset - although unioning on additional groups would be a workaround)
![](https://www.thedataschool.co.uk/content/images/2023/04/image-176.png)
I then worked on the unassigned flow, sorted by rowID, I thought I might be able to sample x rows and give them the appropriate GroupID. I realized that I would want to do this for each group seperately and union the results. Therefore a batch macro made sense with the summarized table mentioned above as the control parameter.
![](https://www.thedataschool.co.uk/content/images/2023/04/image-177.png)
The macro itself had two update values - the spaces remaining value from the summarized table updated the sample top N while the groupID of that table updated the formula tool with the GroupID to assign.
The macro worked as expected and all that was left was to union the already assigned cases and do a quick summarize to count GroupIDs and verify that each group had 20 cases as expected.
![](https://www.thedataschool.co.uk/content/images/2023/04/image-178.png)
![](https://www.thedataschool.co.uk/content/images/2023/04/image-179.png)
Overall I was really satisfied with the solution, however, I am yet to be credited with a solve as it is up to the author of the post to bestow this honour - after all I might think I have solved his problem but without a consulting setting where I can question what they want and why I might not have solved his actual problem. Hopefully the solution I provided did help and that a solve designation is on the way.