You're working with a large database of employee records. For the purposes of this question, we'll picture the database as a two-dimensional table T with a set R of m rows and a set C of n columns; the rows correspond to individual employees, and the columns correspond to different attributes.
To take a simple example, we may have four columns labeled
name, phone number, start date, manager's name
and a table with five employees as shown here.
name | phone number | start date | manager's name |
Alanis | 3-4563 | 6/13/95 | Chelsea |
Chelsea | 3-2341 | 1/20/93 | Lou |
Elrond | 3-2345 | 12/19/01 | Chelsea |
Hal | 3-9000 | 1/12/97 | Chelsea |
Raj | 3-3453 | 7/1/96 | Chelsea |
Given a subset S of the columns, we can obtain a new, smaller table by keeping only the entries that involve columns from S. We will call this new table the projection of T onto S, and denote it by T[S]. For example, if S = {name, start date}, then the projection T[S] would be the table consisting of just the first and third columns.
There's a different operation on tables that is also useful, which is to permute the columns. Given a permutation p of the columns, we can obtain a new table of the same size as T by simply reordering the columns according to p. We will call this new table the permutation of T by p, and denote it by Tp.
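To make the two operations concrete, here is a minimal sketch in Python. The representation (a list of column names plus a list of rows) and the function names `project` and `permute` are assumptions chosen for illustration; the problem itself does not prescribe any data structure.

```python
# Hypothetical representation of the example table T:
# a list of column names and a list of rows, one row per employee.
columns = ["name", "phone number", "start date", "manager's name"]
rows = [
    ["Alanis", "3-4563", "6/13/95", "Chelsea"],
    ["Chelsea", "3-2341", "1/20/93", "Lou"],
    ["Elrond", "3-2345", "12/19/01", "Chelsea"],
    ["Hal", "3-9000", "1/12/97", "Chelsea"],
    ["Raj", "3-3453", "7/1/96", "Chelsea"],
]


def project(columns, rows, subset):
    """Return T[S]: keep only the columns in `subset`, in their original order."""
    keep = [i for i, c in enumerate(columns) if c in subset]
    return [columns[i] for i in keep], [[row[i] for i in keep] for row in rows]


def permute(columns, rows, order):
    """Return Tp: the same table with its columns rearranged into `order`."""
    if sorted(order) != sorted(columns):
        raise ValueError("order must be a permutation of the columns")
    idx = [columns.index(c) for c in order]
    return list(order), [[row[i] for i in idx] for row in rows]


proj_cols, proj_rows = project(columns, rows, {"name", "start date"})
perm_cols, perm_rows = permute(
    columns, rows, ["name", "start date", "manager's name", "phone number"]
)
```

Note that the projection shrinks the table to |S| columns, while the permutation keeps all n columns and only changes their left-to-right order.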
All of this comes into play for your particular application, as follows. You have k different subsets of the columns S1, S2, ..., Sk that you're going to be working with a lot, so you'd like to have them available in a readily accessible format. One choice would be to store the k projections T[S1], T[S2], ..., T[Sk], but this would take up a lot of space. In considering alternatives to this, you learn that you may not need to explicitly project onto each subset, because the underlying database system can deal with a subset of the columns particularly efficiently if (in some order) the members of the subset constitute a prefix of the columns in left-to-right order. So, in our example, the subsets {name, phone number} and {name, start date, phone number} constitute prefixes (they're the first two and first three columns from the left, respectively); and as such, they can be processed much more efficiently in this table than a subset such as {name, start date}, which does not constitute a prefix. (Again, note that a given subset Si does not come with a specified order, and so we are interested in whether there is some order under which it forms a prefix of the columns.)
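Because a subset carries no order of its own, the prefix condition can be checked by comparing sets: a subset S forms a prefix exactly when the first |S| columns, taken as a set, equal S. The sketch below illustrates this on the three subsets just mentioned; the function name `is_prefix` is a hypothetical helper, not something the database system is stated to provide.

```python
def is_prefix(order, subset):
    """True if the columns in `subset` are exactly the first len(subset)
    columns of `order`, in any internal order."""
    subset = set(subset)
    return set(order[:len(subset)]) == subset


order = ["name", "phone number", "start date", "manager's name"]

print(is_prefix(order, {"name", "phone number"}))                # True
print(is_prefix(order, {"name", "start date", "phone number"}))  # True
print(is_prefix(order, {"name", "start date"}))                  # False
```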
So here's the question: Given a parameter l < k, can you find l permutations of the columns p1, p2, ..., pl so that for every one of the given subsets Si (for i = 1, 2, ..., k), it's the case that the columns in Si constitute a prefix of at least one of the permuted tables Tp1, Tp2, ..., Tpl? We'll say that such a set of permutations constitutes a valid solution to the problem; if a valid solution exists, it means you only need to store the l permuted tables rather than all k projections. Give a polynomial-time algorithm to solve this problem; for instances on which there is a valid solution, your algorithm should return an appropriate set of l permutations.
Example. Suppose the table is as above, the given subsets are
S1 = {name, phone number},
S2 = {name, start date},
S3 = {name, manager's name, start date},
and l = 2. Then there is a valid solution to the instance, and it could be achieved by the two permutations
p1 = {name, phone number, start date, manager's name},
p2 = {name, start date, manager's name, phone number}.
This way, S1 constitutes a prefix of the permuted table Tp1, and both S2 and S3 constitute prefixes of the permuted table Tp2.
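The problem statement does not include a solution, so what follows is only one possible approach, sketched under stated assumptions. The observation behind it is that the subsets served by a single permutation must be nested, since each is a prefix of the same column order. The question then becomes whether the k subsets can be split into at most l nested chains, and the minimum number of chains equals k minus the size of a maximum matching in a bipartite graph that has an edge from Si to Sj whenever Si is contained in Sj. The names `find_permutations` and `is_valid` are hypothetical, and the sketch assumes every subset contains only columns of the table.

```python
def find_permutations(columns, subsets, l):
    """Return l column orders that cover every subset as a prefix, or None."""
    sets = [frozenset(s) for s in subsets]
    k = len(sets)
    # Edge i -> j when S_i may directly precede S_j in a chain: strict
    # containment, with equal sets ordered by index to avoid cycles.
    succ_options = [
        [j for j in range(k)
         if sets[i] < sets[j] or (sets[i] == sets[j] and i < j)]
        for i in range(k)
    ]
    match_of_right = [None] * k  # match_of_right[j] == i: S_j follows S_i

    def try_augment(i, seen):
        # Standard augmenting-path step for bipartite matching.
        for j in succ_options[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_of_right[j] is None or try_augment(match_of_right[j], seen):
                match_of_right[j] = i
                return True
        return False

    matched = sum(try_augment(i, set()) for i in range(k))
    if k - matched > l:          # more chains needed than permutations allowed
        return None

    next_in_chain = {i: j for j, i in enumerate(match_of_right) if i is not None}
    starts = [j for j in range(k) if match_of_right[j] is None]
    orders = []
    for start in starts:
        order, node = [], start
        while node is not None:
            # Append the columns this subset adds beyond the previous one.
            order += [c for c in columns if c in sets[node] and c not in order]
            node = next_in_chain.get(node)
        order += [c for c in columns if c not in order]  # remaining columns
        orders.append(order)
    while len(orders) < l:       # pad with the original order if chains < l
        orders.append(list(columns))
    return orders


def is_valid(orders, subsets):
    """Check that each subset is a prefix of at least one column order."""
    return all(any(set(o[:len(s)]) == set(s) for o in orders) for s in subsets)


columns = ["name", "phone number", "start date", "manager's name"]
subsets = [
    {"name", "phone number"},
    {"name", "start date"},
    {"name", "manager's name", "start date"},
]
solution = find_permutations(columns, subsets, 2)
print(solution, is_valid(solution, subsets))
```

Building the containment edges takes O(k^2 n) time and the augmenting-path matching takes O(k^3) in the worst case, so the sketch runs in polynomial time. On the example instance with l = 1 it returns None, since S1 and S2 are not nested and cannot share a permutation.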