I am trying to design a database for this case:

  1. Assignments have vectors (1:N).
  2. Assignments have submissions (1:N).
  3. Submissions have executions (1:N).
  4. Every execution has exactly one vector.

Database EER diagram

Business logic

  1. A teacher creates assignments and defines test vectors.
  2. A student uploads a solution, which creates a record in submissions.
  3. After the submission compiles successfully, it is executed against the defined test vectors. Each execution is one record in executions (one execution per vector).

So the circular reference is created only after a successful execution; if compilation fails, no record is created in executions. The link between vectors and executions is needed for score calculation, where the reference output from vectors is compared with the actual output from executions.
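To make the workflow concrete, here is a rough sketch in plain Python. The names `Vector`, `ExecutionRecord`, `run_submission`, `score` and the `compiled_ok` flag are all made up for illustration; only the rules (no executions on a failed compilation, one execution per vector, score by comparing outputs) come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Vector:
    id: int
    input_data: str
    reference_output: str


@dataclass
class ExecutionRecord:
    submission_id: int
    vector_id: int  # the link used later for scoring
    output: str


def run_submission(
    submission_id: int,
    compiled_ok: bool,
    vectors: List[Vector],
    run: Callable[[str], str],
) -> List[ExecutionRecord]:
    """Create one execution record per vector, or none if compilation failed."""
    if not compiled_ok:
        return []
    return [ExecutionRecord(submission_id, v.id, run(v.input_data)) for v in vectors]


def score(records: List[ExecutionRecord], vectors: List[Vector]) -> int:
    """Count executions whose output matches the vector's reference output."""
    reference = {v.id: v.reference_output for v in vectors}
    return sum(1 for r in records if r.output == reference[r.vector_id])


# Hypothetical assignment: sort whitespace-separated tokens.
vectors = [Vector(1, "2 1", "1 2"), Vector(2, "3", "3")]
student_program = lambda text: " ".join(sorted(text.split()))

records = run_submission(100, True, vectors, student_program)
failed = run_submission(101, False, vectors, student_program)
```

Here `score(records, vectors)` needs `vector_id` on each record to find the matching reference output, which is the link the question is about.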

In my case the circular reference is not always present, since it only appears at runtime. Does that make it a wrong design?

asked Mar 22, 2014 at 14:39

1 Answer

The tricky part is that relational databases don't care about the direction of the relationship or circular dependencies. In the object model we do care.

Define your "has a" relationships, and then abstract one of the classes with an interface to break the circular dependency. For example:

  • Assignment aggregates TestVector (i.e. has a collection of)
  • Submission has an Assignment
  • Execution has a Submission
  • Execution aggregates ExecutionStep (interface)
  • Vector is an ExecutionStep (interface)

See how Execution refers to an abstract ExecutionStep rather than a concrete Vector? That breaks the circular dependency, because now you can define other things for the execution to run without changing the object model.
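A minimal sketch of that object model in Python, using `typing.Protocol` for the interface. The field names and the `expected_output` and `passed` methods are assumptions for illustration. Since the question says each execution has exactly one vector, this sketch holds a single step rather than a collection.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class ExecutionStep(Protocol):
    """Anything an Execution can run (hypothetical interface)."""

    def expected_output(self) -> str: ...


@dataclass
class Vector:
    input_data: str
    reference_output: str

    # Vector satisfies ExecutionStep structurally; no import of Execution needed.
    def expected_output(self) -> str:
        return self.reference_output


@dataclass
class Assignment:
    title: str
    vectors: List[Vector] = field(default_factory=list)


@dataclass
class Submission:
    assignment: Assignment
    source_code: str


@dataclass
class Execution:
    submission: Submission
    step: ExecutionStep  # depends on the interface, not on the concrete Vector
    actual_output: str = ""

    def passed(self) -> bool:
        return self.actual_output == self.step.expected_output()


assignment = Assignment("Sorting")
vector = Vector(input_data="3 1 2", reference_output="1 2 3")
assignment.vectors.append(vector)
submission = Submission(assignment, source_code="...")
execution = Execution(submission, step=vector, actual_output="1 2 3")
```

`Execution` never names `Vector`, so another kind of step could be added later without touching it.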

(However, looking at it now, you don't need the foreign key from execution to vector, because you can get that with a join of Execution -> Submission -> Assignment -> Vector.)
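For illustration, that join could look like this with Python's built-in `sqlite3` module. The table and column names are guesses at the schema in the diagram, and `executions` deliberately has no `vector_id` column here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE vectors (
    id INTEGER PRIMARY KEY,
    assignment_id INTEGER NOT NULL REFERENCES assignments(id),
    reference_output TEXT
);
CREATE TABLE submissions (
    id INTEGER PRIMARY KEY,
    assignment_id INTEGER NOT NULL REFERENCES assignments(id)
);
CREATE TABLE executions (
    id INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL REFERENCES submissions(id),
    output TEXT
);
INSERT INTO assignments VALUES (1, 'Sorting');
INSERT INTO vectors VALUES (10, 1, '1 2 3'), (11, 1, '4 5 6');
INSERT INTO submissions VALUES (100, 1);
INSERT INTO executions VALUES (1000, 100, '1 2 3');
""")

# Walk Execution -> Submission -> Assignment -> Vector for one execution.
rows = conn.execute("""
    SELECT v.id
    FROM executions e
    JOIN submissions s ON s.id = e.submission_id
    JOIN assignments a ON a.id = s.assignment_id
    JOIN vectors v ON v.assignment_id = a.id
    WHERE e.id = ?
    ORDER BY v.id
""", (1000,)).fetchall()
vector_ids = [row[0] for row in rows]
```

Note that this returns every vector of the assignment. It does not say which single vector a particular execution ran against, so it only replaces the foreign key if that distinction is not needed.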

...

When you write the classes to model these tables, then you have a circular code dependency, which is bad. So you can break the code dependency with an interface, but that's a separate issue from the database design.

In the database design, the reason Execution -> Vector is problematic is that it duplicates Execution -> Submission -> Assignment -> Vector. So you'd have to make sure those stay in sync.
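If the direct Execution -> Vector key is kept, one way to check that the two paths agree is a query along these lines. This is a sketch with `sqlite3` and an assumed, simplified schema where `assignment_id` is just an integer column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vectors (id INTEGER PRIMARY KEY, assignment_id INTEGER NOT NULL);
CREATE TABLE submissions (id INTEGER PRIMARY KEY, assignment_id INTEGER NOT NULL);
CREATE TABLE executions (
    id INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL REFERENCES submissions(id),
    vector_id INTEGER NOT NULL REFERENCES vectors(id)
);
INSERT INTO vectors VALUES (10, 1), (20, 2);
INSERT INTO submissions VALUES (100, 1);
INSERT INTO executions VALUES (1000, 100, 10);  -- consistent
INSERT INTO executions VALUES (1001, 100, 20);  -- vector from another assignment
""")

# Executions whose vector belongs to a different assignment than their submission.
mismatched = [row[0] for row in conn.execute("""
    SELECT e.id
    FROM executions e
    JOIN submissions s ON s.id = e.submission_id
    JOIN vectors v ON v.id = e.vector_id
    WHERE v.assignment_id <> s.assignment_id
    ORDER BY e.id
""")]
```

An empty result means the two paths are in sync. Here execution 1001 is reported because its vector belongs to assignment 2 while its submission belongs to assignment 1.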

Unless:

  • you're duplicating the data as a performance optimization & know the risks

  • you want Execution to reflect the vectors involved when it ran

That is, let's say an Assignment adds a new Vector. Any existing Executions won't see that new vector, because they already point to their list of vectors. But that may be a good thing, because it's saying "When this Execution ran, it used these Vectors, even though an Assignment may have added or removed Vectors since then".
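A small sketch of that snapshot behaviour, again with `sqlite3` and an assumed schema: executions record the vectors that existed when the submission ran, and a vector added later only shows up through the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vectors (id INTEGER PRIMARY KEY, assignment_id INTEGER NOT NULL);
CREATE TABLE submissions (id INTEGER PRIMARY KEY, assignment_id INTEGER NOT NULL);
CREATE TABLE executions (
    id INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL REFERENCES submissions(id),
    vector_id INTEGER NOT NULL REFERENCES vectors(id)
);
INSERT INTO vectors VALUES (10, 1), (11, 1);
INSERT INTO submissions VALUES (100, 1);
""")

# Run the submission: one execution row per vector that exists right now.
conn.execute("""
    INSERT INTO executions (submission_id, vector_id)
    SELECT s.id, v.id
    FROM submissions s
    JOIN vectors v ON v.assignment_id = s.assignment_id
    WHERE s.id = ?
""", (100,))

# Later, a new vector is added to the same assignment.
conn.execute("INSERT INTO vectors VALUES (12, 1)")

# Vectors recorded at run time, via the direct foreign key.
recorded = [row[0] for row in conn.execute(
    "SELECT vector_id FROM executions WHERE submission_id = ? ORDER BY vector_id",
    (100,),
)]

# Vectors the assignment has now, via the join path.
current = [row[0] for row in conn.execute(
    "SELECT v.id FROM submissions s "
    "JOIN vectors v ON v.assignment_id = s.assignment_id "
    "WHERE s.id = ? ORDER BY v.id",
    (100,),
)]
```

After the insert, `recorded` still lists only the two original vectors while `current` includes the new one, which is the difference between the two paths once vectors can change.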

I was thinking these artifacts wouldn't change; that they are all immutable. But that may be a bad assumption. So I could be wrong -- the reference from Execution to Vector may not be redundant. You'd know the answer to that better than me.

answered Mar 22, 2014 at 14:58
  • Thanks for the answer. Yes, a join is possible, but is it good practice to join 3 tables? By ExecutionStep, do you mean an m:n table like vector_id, execution_id? Thanks. Commented Mar 22, 2014 at 15:28
  • execution -> submissions -> assignments are all just index lookups, so those are fast. The only outer join is from assignments -> vectors. In other words, the only join that "blows up" the result set is the last one. The other two are just lookups. In a big production system you might have joins across dozens of tables. But if you need performance, you might not want to have any joins at all. In this case, since the execution is doing offline processing, it looks OK. Commented Mar 22, 2014 at 15:35
  • You could break it up into two if your object-relational mapping made it messy. Go Execution -> Submission -> Assignment to resolve the original assignment, and then Assignment -> Vector to get the list of vectors. But I'd probably just do it in one shot unless it got too messy. Commented Mar 22, 2014 at 15:38
  • Sorry -- I missed the second part of your question. I amended the answer to pick that up. :) Commented Mar 22, 2014 at 15:55
  • It should work as you have written: "...it's saying 'When this Execution ran, it used these Vectors, even though an Assignment may have added or removed Vectors since then'." The assumption is that the submission was executed with this set of vectors. When something changes in the vectors of the selected assignment, it creates a new submission ... but one execution has only one vector. Commented Mar 23, 2014 at 12:32
