Model-based testing relies on models of a system under test (SUT) and its environment. To define what makes a test case "interesting", test selection criteria are needed. Structural criteria lend themselves to automated test generation and may be required by development standards. Yet, when used as a test selection criterion, coverage usually does not correlate with failure detection. I'll motivate the need to move beyond the a priori use of coverage criteria for test selection by presenting one conceptualization from the literature. Using this conceptualization, one can show that, in general, partition-based testing (which selects tests from a partition of a program's input domain and therefore subsumes coverage-based test selection) can be better than, the same as, or worse than random testing. This calls into question the very idea of partition-based testing in the typical situation where one cannot assume a priori that some blocks of the input domain partition are more likely than others to contain failure-causing inputs. On these grounds, I'll present one approach to generating tests via model-based flaw injection. These tests (1) target security properties and attacks, such as cross-site scripting, rather than structural criteria, and (2) are turned into implementation-level tests that are run semi-automatically and can often reproduce the attacks at the implementation level.
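
The claim that partition-based testing can be better than, the same as, or worse than random testing can be illustrated with a small calculation. The sketch below is not necessarily the conceptualization referred to above; it assumes a simple model in which the input domain is finite, a fixed number of inputs cause failures, and tests are drawn uniformly with replacement. The function names, block sizes, and failure counts are hypothetical and chosen only for illustration.

```python
from math import prod


def p_random(block_sizes, failing_counts, n_tests):
    """Probability that n_tests uniform random tests over the whole
    input domain hit at least one failure-causing input."""
    theta = sum(failing_counts) / sum(block_sizes)  # overall failure rate
    return 1 - (1 - theta) ** n_tests


def p_partition(block_sizes, failing_counts, tests_per_block):
    """Probability that at least one test fails when tests_per_block[i]
    tests are drawn uniformly from block i of the partition."""
    return 1 - prod(
        (1 - failing / size) ** n
        for size, failing, n in zip(block_sizes, failing_counts, tests_per_block)
    )


# Hypothetical domain: 100 inputs, 8 of them failure-causing, 2 tests in total.
# Each scenario splits the same domain into two blocks in a different way.
scenarios = {
    "failures concentrated in one small block": ([8, 92], [8, 0]),
    "failures spread evenly over both blocks": ([50, 50], [4, 4]),
    "one test spent on a failure-free block": ([1, 99], [0, 8]),
}

for name, (sizes, failing) in scenarios.items():
    rnd = p_random(sizes, failing, n_tests=2)
    part = p_partition(sizes, failing, tests_per_block=[1, 1])
    print(f"{name}: random={rnd:.4f}, partition={part:.4f}")
```

Under these assumed numbers, the first partition detects a failure with certainty, the second matches random testing, and the third does worse than random testing. The only difference between the scenarios is how the failure-causing inputs are distributed over the blocks, which is exactly the information that is typically unavailable a priori.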